Balanced Autoencoder
The balanced autoencoder is the simplest form of autoencoder. Its encoder and decoder have the same number of layers, and the layer sizes in the decoder mirror those in the encoder.
Importing Balanced Autoencoder
Creating a synthesizer
Parameters
dataset (required, pd.DataFrame): A pandas data frame containing both the original minority and the original majority data.
minority_column_label (required, string): The label of the column that holds the class values. E.g. 'class', 'output'.
minority_class_label (required, string): The minority class label. E.g. '1', '0'.
Returns
An instance of class BalancedAutoencoder.
Network Architecture
Below is the network architecture for the Balanced Autoencoder. The encoder has two layers, with 22 and 20 nodes respectively. The bottleneck is a single Dense layer with 16 nodes. The decoder has two layers, with 20 and 22 nodes respectively. The decoder layer activation is sigmoid.
encoder_dense_layers: 22, 20
bottle_neck: 16
decoder_dense_layers: 20, 22
decoder_activation: sigmoid
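The mirrored layout can be checked with a few lines of plain Python. This is only an illustrative sketch: the helper names `is_balanced` and `layer_widths` are hypothetical and not part of the library, and the layer sizes are copied from the values listed above.

```python
# Layer sizes copied from the architecture values listed above.
encoder_dense_layers = [22, 20]
bottle_neck = 16
decoder_dense_layers = [20, 22]


def is_balanced(encoder, decoder):
    """Hypothetical helper: True when the decoder sizes are the encoder sizes reversed."""
    return list(decoder) == list(reversed(encoder))


def layer_widths(encoder, bottleneck, decoder):
    """Hypothetical helper: hidden-layer widths in order, encoder side first."""
    return [*encoder, bottleneck, *decoder]


widths = layer_widths(encoder_dense_layers, bottle_neck, decoder_dense_layers)
print(is_balanced(encoder_dense_layers, decoder_dense_layers))  # True
print(widths)  # [22, 20, 16, 20, 22]
```

The widths narrow toward the 16-node bottleneck and then widen again symmetrically, which is what "balanced" refers to here.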
Synthetic Data Generation
data_generator()
The data_generator function generates synthetic data using the synthesizer. It takes one optional parameter, no_of_syntetic_data.
If no_of_syntetic_data is not defined, data_generator generates as many synthetic samples as are needed to balance the majority and minority classes.
If no_of_syntetic_data is defined and the dataset is already balanced, data_generator generates 2 * (number of original minority samples) synthetic samples.
If no_of_syntetic_data is defined and the dataset is not balanced, data_generator generates as many synthetic samples as the value passed.
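The three rules above can be summarized as a small decision function. This is a sketch of the documented behavior only, not the library's implementation: `synthetic_count` is a hypothetical helper, and it assumes the majority class is at least as large as the minority class.

```python
def synthetic_count(n_majority, n_minority, no_of_syntetic_data=None):
    """Hypothetical helper: number of synthetic samples per the documented rules.

    Assumes n_majority >= n_minority.
    """
    if no_of_syntetic_data is None:
        # Not defined: generate enough samples to balance the two classes.
        return n_majority - n_minority
    if n_majority == n_minority:
        # Defined, dataset already balanced: twice the original minority count.
        return 2 * n_minority
    # Defined, dataset not balanced: exactly the value passed.
    return no_of_syntetic_data


print(synthetic_count(900, 100))       # 800
print(synthetic_count(500, 500, 100))  # 1000
print(synthetic_count(900, 100, 100))  # 100
```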
Parameters
no_of_syntetic_data (optional, default: None, integer): The number of synthetic samples to generate.
Returns
A pandas data frame that combines original data and synthetic data.
Usage
synthesizer.data_generator()
synthesizer.data_generator(no_of_syntetic_data=100)
The data frame returned from data_generator() includes an added column `synthetic_data`.
df['synthetic_data'] = 0 : original data
df['synthetic_data'] = 1 : synthetically generated data
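Because the flag is an ordinary column, the original and synthetic rows can be separated with standard pandas boolean indexing. The sketch below uses a small made-up data frame as a stand-in for the one returned by data_generator(); the `feature` and `class` columns and their values are assumptions for illustration only.

```python
import pandas as pd

# Stand-in for the frame returned by data_generator(); values are made up.
df = pd.DataFrame({
    "feature": [0.10, 0.40, 0.35, 0.80],
    "class": [0, 1, 1, 1],
    "synthetic_data": [0, 0, 1, 1],
})

# Rows that came from the original dataset.
original = df[df["synthetic_data"] == 0]

# Rows that were generated by the synthesizer.
synthetic = df[df["synthetic_data"] == 1]

print(len(original), len(synthetic))  # 2 2
```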