ANVO SMOTE
The ANVO algorithm, a variant of SMOTE, targets class imbalances by enriching the minority class with synthetic data. It assesses the distribution and relationships within the minority class to create new, meaningful samples. By adapting the neighbor analysis based on class density, ANVO ensures the generated instances enhance the diversity and coverage of the minority class in underrepresented areas of the feature space. This approach aims to balance the dataset, improving the performance of machine learning models by providing a more representative sample distribution.
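For orientation, the interpolation step that SMOTE-style methods share can be sketched in a few lines. This is a minimal illustration of classic SMOTE only, not ANVO's implementation: the density-adaptive neighbor analysis described above is not reproduced, and the function name `smote_sketch` and its parameters `k` and `seed` are hypothetical.

```python
import math
import random


def smote_sketch(minority, num_to_synthesize, k=3, seed=0):
    """Classic SMOTE-style interpolation (illustrative, not the ANVO code).

    `minority` is a list of equal-length numeric tuples with at least two points.
    """
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(num_to_synthesize):
        # Pick a random minority sample as the base point.
        i = rng.randrange(len(minority))
        base = minority[i]
        # Find its k nearest minority neighbors by Euclidean distance.
        others = [p for j, p in enumerate(minority) if j != i]
        neighbors = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbor = rng.choice(neighbors)
        # Place the new point at a random position on the segment base -> neighbor.
        gap = rng.random()
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbor)))
    return synthetic


minority_points = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
new_points = smote_sketch(minority_points, num_to_synthesize=5)
```

Because each synthetic point lies on a segment between two existing minority samples, the new points stay inside the region the minority class already occupies.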
Importing SMOTE
Creating a synthesizer
Parameters
dataset
required
pd.DataFrame
A pandas DataFrame containing both the original minority and the original majority data.
minority_column_label
required
string
The label of the column that holds the class values, e.g. 'class' or 'output'.
minority_class_label
required
string
The label of the minority class, e.g. '1' or '0'.
Returns
An instance of class SDD_SMOTE.
Synthetic Data Generation
data_generator()
The data_generator function generates synthetic data using the synthesizer. It takes one optional parameter, num_to_synthesize.
If num_to_synthesize is not defined, data_generator generates as many synthetic samples as are needed to balance the majority and minority classes.
If num_to_synthesize is defined and the dataset is already balanced, data_generator generates 2 * (number of original minority samples) synthetic samples.
If num_to_synthesize is defined and the dataset is not balanced, data_generator generates exactly the number of synthetic samples passed.
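The three cases above can be summarized as a small decision function. This is only a sketch of the documented behavior: the helper `resolve_synthetic_count` and its arguments are hypothetical and not part of the library, and "balanced" is assumed to mean equal majority and minority counts.

```python
def resolve_synthetic_count(n_majority, n_minority, num_to_synthesize=None):
    """Hypothetical helper mirroring the three documented cases."""
    if num_to_synthesize is None:
        # Not defined: generate enough samples to balance the two classes.
        return n_majority - n_minority
    if n_majority == n_minority:
        # Defined, but the dataset is already balanced (assumed: equal counts).
        return 2 * n_minority
    # Defined and the dataset is not balanced: use the requested value.
    return num_to_synthesize


balance_count = resolve_synthetic_count(900, 100)
balanced_count = resolve_synthetic_count(500, 500, num_to_synthesize=100)
requested_count = resolve_synthetic_count(900, 100, num_to_synthesize=100)
```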
Parameters
num_to_synthesize
optional (default: None)
integer
The number of synthetic samples to generate.
Returns
A pandas data frame that combines original data and synthetic data.
Usage
synthesizer.data_generator()
synthesizer.data_generator(num_to_synthesize=100)