Gamma SMOTE
Gamma SMOTE is a SMOTE variant that uses a gamma distribution to generate synthetic instances of the minority class in imbalanced datasets. It interpolates between existing minority instances and their nearest neighbors, with the interpolation governed by a gamma distribution. The key parameters are alpha and beta, which control the shape and scale of the gamma distribution.
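The interpolation idea can be sketched in a few lines of NumPy. This is a minimal illustration, not the library's implementation: the helper names (gamma_interpolate, nearest_neighbors, synthesize), the default values of alpha, beta and k, and the choice to clip the gamma draw to [0, 1] are all assumptions made for the example.

```python
import numpy as np

def gamma_interpolate(x, neighbor, alpha, beta, rng):
    """Return one synthetic point between x and neighbor.

    Assumption: the interpolation weight is drawn from
    Gamma(shape=alpha, scale=beta) and clipped to 1.0 so the
    new point stays on the segment joining the two instances.
    """
    t = min(rng.gamma(shape=alpha, scale=beta), 1.0)
    return x + t * (neighbor - x)

def nearest_neighbors(points, k):
    """Indices of the k nearest neighbors of each row (brute force)."""
    dists = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=2)
    np.fill_diagonal(dists, np.inf)  # a point is not its own neighbor
    return np.argsort(dists, axis=1)[:, :k]

def synthesize(minority, n_new, alpha=2.0, beta=0.125, k=3, seed=0):
    """Generate n_new synthetic rows from a minority-class array."""
    rng = np.random.default_rng(seed)
    neighbors = nearest_neighbors(minority, k)
    out = []
    for _ in range(n_new):
        i = rng.integers(len(minority))      # random minority instance
        j = neighbors[i, rng.integers(k)]    # one of its k neighbors
        out.append(gamma_interpolate(minority[i], minority[j], alpha, beta, rng))
    return np.array(out)

minority = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
synthetic = synthesize(minority, n_new=5)
```

With a small scale such as the one assumed here, most draws fall near zero, so synthetic points tend to sit close to the original instance rather than spreading uniformly along the segment as in classic SMOTE.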
Importing SMOTE
Creating a synthesizer
Parameters
dataset
required
pd.DataFrame
Represents a pandas data frame containing both the original minority data and the original majority data.
minority_column_label
required
string
Represents the label of the column that contains the class values, e.g. 'class' or 'output'.
minority_class_label
required
string
Represents the label of the minority class, e.g. '1' or '0'.
Returns
An instance of class Gamma_SMOTE.
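As an illustration of the three arguments, the following sketch builds a small imbalanced pandas data frame and derives the values that would be passed to the synthesizer. The column names and data are made up for the example, and the constructor call itself is not shown because the import path is not given here.

```python
import pandas as pd

# Hypothetical data: six majority rows ('0') and two minority rows ('1').
dataset = pd.DataFrame({
    "feature_1": [0.1, 0.4, 0.35, 0.8, 0.9, 0.2, 0.7, 0.65],
    "feature_2": [1.2, 0.9, 1.1, 0.3, 0.2, 1.0, 0.4, 0.5],
    "class": ["0", "0", "0", "0", "0", "0", "1", "1"],
})

# Label of the column that holds the class values.
minority_column_label = "class"

# The minority class is the least frequent value in that column.
counts = dataset[minority_column_label].value_counts()
minority_class_label = counts.idxmin()
```

Both labels are strings, matching the parameter types listed above.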
Synthetic Data Generation
data_generator()
The data_generator function generates synthetic data using the synthesizer. It has an optional parameter, num_to_synthesize.
If num_to_synthesize is not defined, data_generator by default generates as many synthetic samples as are needed to balance the majority and minority classes.
If num_to_synthesize is defined and the dataset is already balanced, data_generator generates 2 * (number of original minority samples) synthetic samples.
If num_to_synthesize is defined and the dataset is not balanced, data_generator generates exactly the number of synthetic samples passed.
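The three rules above can be summarized as a small helper. This is only a sketch of the documented behavior, not the library's code: the function name resolve_num_to_synthesize is hypothetical, and "balanced" is assumed to mean that the majority and minority classes have the same number of rows.

```python
def resolve_num_to_synthesize(n_majority, n_minority, num_to_synthesize=None):
    """Return how many synthetic samples would be generated.

    Assumption: the default fills the gap between the class counts,
    i.e. n_majority - n_minority.
    """
    if num_to_synthesize is None:
        # Default: generate enough samples to balance the classes.
        return n_majority - n_minority
    if n_majority == n_minority:
        # Already balanced: twice the original minority count.
        return 2 * n_minority
    # Not balanced: use the requested value as given.
    return num_to_synthesize

default_count = resolve_num_to_synthesize(90, 10)
balanced_count = resolve_num_to_synthesize(50, 50, num_to_synthesize=30)
explicit_count = resolve_num_to_synthesize(90, 10, num_to_synthesize=25)
```

Note that in the balanced case the requested value is not used, which follows the rule as stated.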
Parameters
num_to_synthesize
optional (default: None)
integer
Represents the number of synthetic samples to be generated.
Returns
A pandas data frame that combines original data and synthetic data.
Usage
synthesizer.data_generator()
synthesizer.data_generator(num_to_synthesize=100)