LG · ML · Apr 19, 2022

Imbalanced Classification via a Tabular Translation GAN

arXiv:2204.08683v1 · h-index: 44
Originality: Incremental advance
AI Analysis

This paper addresses class imbalance in tabular data for predictive modeling, but the advance is incremental because it builds on existing GAN and oversampling techniques.

The paper tackles binary classification under severe class imbalance. It uses a GAN-based model with additional regularization losses to translate majority samples into synthetic minority samples near the class boundary, and reports improved average precision over alternative re-weighting and oversampling methods on tabular datasets.

When presented with a binary classification problem where the data exhibits severe class imbalance, most standard predictive methods may fail to accurately model the minority class. We present a model based on Generative Adversarial Networks which uses additional regularization losses to map majority samples to corresponding synthetic minority samples. This translation mechanism encourages the synthesized samples to be close to the class boundary. Furthermore, we explore a selection criterion to retain the most useful of the synthesized samples. Experimental results using several downstream classifiers on a variety of tabular class-imbalanced datasets show that the proposed method improves average precision when compared to alternative re-weighting and oversampling techniques.
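The abstract reports results in terms of average precision, a ranking metric that suits imbalanced problems because it looks only at where the minority samples land in the score ordering. The following is a minimal sketch of that metric in plain Python, not code from the paper: the function name `average_precision`, the toy labels and scores, and the tie handling (ties keep input order) are all illustrative assumptions.

```python
def average_precision(labels, scores):
    """Mean of precision@k over the ranks k at which a positive sample appears.

    labels: list of 0/1 values, where 1 marks the minority (positive) class.
    scores: list of classifier scores, higher meaning "more likely positive".
    Assumption: tied scores keep their input order (stable sort).
    """
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("need at least one positive (minority) sample")

    # Rank sample indices by descending score.
    order = sorted(range(len(scores)), key=lambda i: -scores[i])

    hits = 0
    total = 0.0
    for rank, i in enumerate(order, start=1):
        if labels[i] == 1:
            hits += 1
            total += hits / rank  # precision at this rank
    return total / n_pos


# Hypothetical toy data: 2 minority samples among 8.
labels = [0, 0, 1, 0, 0, 0, 1, 0]
scores = [0.1, 0.4, 0.9, 0.2, 0.8, 0.3, 0.7, 0.05]
ap = average_precision(labels, scores)
print(f"average precision = {ap:.4f}")
```

In this toy ranking the two minority samples sit at ranks 1 and 3, so the metric averages a precision of 1 and a precision of 2/3. A method that pushes more minority samples toward the top of the ranking raises the value, which is the kind of improvement the abstract claims for its synthesized samples.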

