LGOct 23, 2022

Imbalanced Class Data Performance Evaluation and Improvement using Novel Generative Adversarial Network-based Approach: SSG and GBO

arXiv:2210.12870v14 citationsh-index: 25
Originality Incremental advance
AI Analysis

This addresses a common data preprocessing problem in machine learning for practitioners, though it appears incremental as it builds on existing GAN and SMOTE methods.

The paper tackles class imbalance in datasets by proposing two novel oversampling techniques (SSG and GBO) that combine GANs with SMOTE, showing they outperform original SMOTE on eight benchmark datasets and generate Gaussian-distributed minority samples.

Class imbalance in a dataset is one of the major challenges that can significantly impact the performance of machine learning models resulting in biased predictions. Numerous techniques have been proposed to address class imbalanced problems, including, but not limited to, Oversampling, Undersampling, and cost-sensitive approaches. Due to its ability to generate synthetic data, oversampling techniques such as the Synthetic Minority Oversampling Technique (SMOTE) is among the most widely used methodology by researchers. However, one of SMOTE's potential disadvantages is that newly created minor samples may overlap with major samples. As an effect, the probability of ML models' biased performance towards major classes increases. Recently, generative adversarial network (GAN) has garnered much attention due to its ability to create almost real samples. However, GAN is hard to train even though it has much potential. This study proposes two novel techniques: GAN-based Oversampling (GBO) and Support Vector Machine-SMOTE-GAN (SSG) to overcome the limitations of the existing oversampling approaches. The preliminary computational result shows that SSG and GBO performed better on the expanded imbalanced eight benchmark datasets than the original SMOTE. The study also revealed that the minor sample generated by SSG demonstrates Gaussian distributions, which is often difficult to achieve using original SMOTE.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes