LG · ML · Mar 22, 2020

Deep Synthetic Minority Over-Sampling Technique

arXiv:2003.09788v1 · 40 citations
AI Analysis

This paper addresses the problem of unstable classification results on imbalanced datasets, but it is incremental, as it builds on the widely used SMOTE method.

The paper tackles the instability of the Synthetic Minority Over-sampling Technique (SMOTE) in imbalanced classification by adapting it into a deep learning architecture, resulting in Deep SMOTE, which outperforms traditional SMOTE in precision, F1 score, and AUC in most test cases.

Synthetic Minority Over-sampling Technique (SMOTE) is the most popular over-sampling method. However, its random nature makes the synthesized data, and even the imbalanced classification results, unstable: running SMOTE n times yields n different sets of synthesized instances and n different classification results. To address this problem, we adapt the SMOTE idea to a deep learning architecture. In this method, a deep neural network regression model is trained on the inputs and outputs of traditional SMOTE. The input to the proposed deep regression model is two randomly chosen data points, concatenated to form a vector of double size. The output is the corresponding randomly interpolated data point between the two chosen vectors, with the original dimension. The experimental results show that Deep SMOTE can outperform traditional SMOTE in terms of precision, F1 score, and Area Under Curve (AUC) in the majority of test cases.
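
The training data described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `make_training_pairs`, the toy `minority` points, and the uniform interpolation weight in [0, 1) (the usual SMOTE convention) are all assumptions. The regression network itself is omitted; only the construction of its inputs (two concatenated points) and targets (an interpolated point) is shown.

```python
import random


def make_training_pairs(minority, n_pairs, seed=0):
    """Build (input, target) pairs for a SMOTE-style regression model.

    input  : two randomly chosen minority points, concatenated (length 2*d)
    target : a random point on the segment between them (length d)
    The name and signature are hypothetical, chosen for illustration.
    """
    rng = random.Random(seed)
    inputs, targets = [], []
    for _ in range(n_pairs):
        a, b = rng.sample(minority, 2)  # two distinct minority points
        gap = rng.random()              # assumed: uniform weight in [0, 1)
        inputs.append(list(a) + list(b))
        targets.append([x + gap * (y - x) for x, y in zip(a, b)])
    return inputs, targets


# Toy minority class with d = 2 features (made-up values).
minority = [[0.0, 1.0], [2.0, 3.0], [4.0, 1.0], [1.0, 5.0]]
X, Y = make_training_pairs(minority, n_pairs=6)
```

Each row of `X` has twice the original dimension and each row of `Y` has the original dimension, matching the input and output shapes described in the abstract. A regressor fitted on `(X, Y)` would then stand in for the random interpolation step.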

