SetConv: A New Approach for Learning from Imbalanced Data
This work addresses learning from imbalanced data in real-world classification tasks such as sentiment analysis.
The paper tackles classifier bias towards the majority class on imbalanced data. It proposes SetConv and an episodic training strategy to extract class representatives, and reports better performance than state-of-the-art (SOTA) methods on large-scale text benchmarks.
For many real-world classification problems, e.g., sentiment classification, most existing machine learning methods are biased towards the majority class when the Imbalance Ratio (IR) is high. To address this problem, we propose a set convolution (SetConv) operation and an episodic training strategy to extract a single representative for each class, so that classifiers can later be trained on a balanced class distribution. We prove that the proposed algorithm is permutation-invariant, i.e., its output does not depend on the order of the inputs. Experiments on multiple large-scale benchmark text datasets show that the proposed framework outperforms other state-of-the-art (SOTA) methods.
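To make the two central notions concrete, the following is a minimal sketch and not the SetConv operation itself, whose definition is not given here. It assumes that IR is the size of the largest class divided by the size of the smallest, and it substitutes a generic mean-pooled linear map as a simple example of a permutation-invariant way to reduce a set of samples to one representative. The function names `imbalance_ratio` and `class_representative`, and the `weights` matrix, are hypothetical.

```python
from collections import Counter
import random


def imbalance_ratio(labels):
    """Assumed definition: largest class size divided by smallest class size."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


def class_representative(samples, weights):
    """Reduce a set of feature vectors to one vector.

    Each sample is mapped through the same linear map (rows of `weights`),
    and the results are averaged. Averaging is symmetric in its arguments,
    so the output does not depend on the order of `samples`.
    This is a generic stand-in, not the paper's SetConv.
    """
    pooled = [0.0] * len(weights)
    for x in samples:
        for j, w_row in enumerate(weights):
            pooled[j] += sum(w * xi for w, xi in zip(w_row, x))
    return [p / len(samples) for p in pooled]


if __name__ == "__main__":
    labels = ["neg"] * 90 + ["pos"] * 10
    print("IR:", imbalance_ratio(labels))

    samples = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
    weights = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    shuffled = samples[:]
    random.shuffle(shuffled)
    # Both calls give the same representative regardless of sample order.
    print(class_representative(samples, weights))
    print(class_representative(shuffled, weights))
```

With one representative per class, a downstream classifier sees the same number of points for every class, which is the balanced setting the abstract describes.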