Random Walk-steered Majority Undersampling
This work addresses class imbalance for machine learning practitioners, offering an incremental improvement to undersampling techniques.
The paper tackles class imbalance in datasets by proposing Random Walk-steered Majority Undersampling (RWMaU), which uses random walks to identify and undersample majority points close to the minority class, resulting in substantial performance improvements over competing methods on 21 datasets and 3 classifiers.
In this work, we propose Random Walk-steered Majority Undersampling (RWMaU), which undersamples the majority points of a class-imbalanced dataset in order to balance the classes. Rather than marking the majority points that belong to the neighborhood of a few minority points, we aim to capture the closeness of each majority point to the minority class. Random walk, a powerful tool for perceiving the proximities of connected points in a graph, is used to identify the majority points that lie close to the minority class. The visit frequencies and the order of visits of the majority points in the walks give an overall measure of their closeness to the minority class. The majority points lying close to the minority class are subsequently undersampled. Empirical evaluation on 21 datasets and 3 classifiers demonstrates a substantial improvement in the performance of RWMaU over the competing methods.
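
The idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the k-nearest-neighbor graph, the walk length, the number of walks, the 1/(step + 1) weighting used to reward early visits, and the names `knn_graph` and `rw_undersample` are all assumptions made for illustration. Walks start from minority points, majority points accumulate a closeness score from their visits, and the highest-scoring majority points are removed until the classes are balanced.

```python
import numpy as np


def knn_graph(X, k):
    """Return an (n, k) array with the indices of each point's k nearest neighbors."""
    d = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)  # pairwise squared distances
    np.fill_diagonal(d, np.inf)  # a point is never its own neighbor
    return np.argsort(d, axis=1)[:, :k]


def rw_undersample(X, y, minority=1, k=5, walk_len=10, walks_per_point=20, seed=0):
    """Illustrative random-walk-guided undersampling (assumed design, not RWMaU itself)."""
    rng = np.random.default_rng(seed)
    nbrs = knn_graph(X, k)
    score = np.zeros(len(X))

    # Start several walks from every minority point.
    for start in np.flatnonzero(y == minority):
        for _ in range(walks_per_point):
            node = start
            for step in range(walk_len):
                node = nbrs[node, rng.integers(k)]  # hop to a uniformly chosen neighbor
                # Assumed weighting: visits early in a walk count more than late ones,
                # so the score reflects both visit frequency and visit order.
                score[node] += 1.0 / (step + 1)

    maj = np.flatnonzero(y != minority)
    n_min = int((y == minority).sum())
    n_remove = max(len(maj) - n_min, 0)  # remove just enough to balance the classes

    # Majority points with the highest closeness score are removed first.
    order = maj[np.argsort(-score[maj], kind="stable")]
    removed = order[:n_remove]
    keep = np.setdiff1d(np.arange(len(X)), removed)
    return X[keep], y[keep], removed
```

Calling `rw_undersample(X, y)` on a feature matrix `X` and label vector `y` returns the reduced data together with the indices of the removed majority points. The dense pairwise distance matrix keeps the sketch short but scales quadratically with the number of points, so a spatial index would be a more reasonable choice for large datasets.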