LG · ML · Aug 21, 2020

Counterfactual-based minority oversampling for imbalanced classification

arXiv:2008.09488v24 citations
AI Analysis

This paper addresses the challenge of generating effective minority samples in imbalanced classification, which matters for model performance in domains such as fraud detection and medical diagnosis. The contribution appears incremental, since it builds on existing oversampling techniques.

The paper tackles oversampling in imbalanced classification by proposing a counterfactual-based framework that uses majority-class information to generate minority samples near the decision boundary. The authors report that it significantly outperforms state-of-the-art methods on benchmark datasets.

A key challenge of oversampling in imbalanced classification is that the generation of new minority samples often neglects information from the majority classes, so most new minority samples spread across the whole minority space. In view of this, we present a new oversampling framework based on counterfactual theory. Our framework introduces a counterfactual objective that leverages the rich inherent information of the majority classes and explicitly perturbs majority samples to generate new samples within the minority space. It can be shown analytically that the new minority samples satisfy the minimum inversion, and therefore most of them lie near the decision boundary. Empirical evaluations on benchmark datasets suggest that our approach significantly outperforms state-of-the-art methods.
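As a rough illustration of the idea in the abstract, the sketch below perturbs each majority sample by the smallest L2 step that moves it just across the decision boundary of a linear classifier, so the synthetic minority samples land near that boundary. This is not the paper's algorithm: the linear score `w·x + b`, the `margin` parameter, and the names `score` and `counterfactual_oversample` are all assumptions made for illustration, and the paper's actual counterfactual objective may differ.

```python
from typing import List, Sequence


def score(x: Sequence[float], w: Sequence[float], b: float) -> float:
    """Linear decision score; positive means 'minority' (assumed convention)."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def counterfactual_oversample(
    majority: Sequence[Sequence[float]],
    w: Sequence[float],
    b: float,
    margin: float = 0.1,
) -> List[List[float]]:
    """Move each majority sample the minimum L2 distance needed to reach
    score == margin on the minority side of a linear boundary.

    Hypothetical helper: assumes a pre-fitted linear classifier (w, b).
    """
    norm_sq = sum(wi * wi for wi in w)
    if norm_sq == 0:
        raise ValueError("weight vector must be non-zero")

    synthetic = []
    for x in majority:
        s = score(x, w, b)
        if s > 0:
            # Already on the minority side of the boundary; nothing to flip.
            continue
        # Closed-form minimal step along w so that the new score equals margin.
        step = (margin - s) / norm_sq
        synthetic.append([xi + step * wi for xi, wi in zip(x, w)])
    return synthetic


# Toy data (made up for this example): boundary is x1 + x2 = 1.
w, b = [1.0, 1.0], -1.0
majority = [[0.0, 0.0], [0.5, 0.0], [-1.0, 0.5]]
synthetic = counterfactual_oversample(majority, w, b, margin=0.1)
for point in synthetic:
    print(point, score(point, w, b))
```

With a non-linear classifier there is no closed-form step, so the minimal perturbation would have to be found by an iterative search instead; the small `margin` is what keeps the generated points close to the boundary rather than deep inside the minority region.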

