Methods for Class-Imbalanced Learning with Support Vector Machines: A Review and an Empirical Evaluation
This paper addresses the problem of class imbalance in machine learning for practitioners who use SVMs, but its contribution is incremental: it reviews and compares existing methods rather than introducing new ones.
This paper reviews and empirically evaluates methods for class-imbalanced learning with Support Vector Machines (SVMs), categorizing them into re-sampling, algorithmic, and fusion approaches. The findings show that fusion methods generally perform best but are more computationally intensive, while algorithmic methods take less time because they require no data pre-processing.
This paper presents a review of methods for class-imbalanced learning with the Support Vector Machine (SVM) and its variants. We first explain the structure of the SVM and its variants and discuss their inefficiency in learning from class-imbalanced data sets. We then introduce a hierarchical categorization of SVM-based models with respect to class-imbalanced learning. Specifically, we categorize SVM-based models into re-sampling, algorithmic, and fusion methods, and discuss the principles of the representative models in each category. In addition, we conduct a series of empirical evaluations to compare the performance of representative SVM-based models from each category on benchmark imbalanced data sets, with imbalance ratios ranging from low to high. Our findings reveal that algorithmic methods are less time-consuming because they require no data pre-processing, whereas fusion methods, which combine re-sampling and algorithmic approaches, generally perform best, at the cost of a higher computational load. We also discuss research gaps and future research directions.
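The three categories can be illustrated with a minimal sketch of the data-side and cost-side ingredients each one relies on. This is not any specific model evaluated in the paper: the helper names (`imbalance_ratio`, `random_oversample`, `balanced_class_weights`), the `target_ratio` parameter, and the toy data are assumptions made for illustration. Random oversampling stands in for the re-sampling category, inverse-frequency class weights stand in for the per-class misclassification costs a cost-sensitive SVM might use, and the fusion case simply applies both. No SVM is trained here.

```python
import random
from collections import Counter


def imbalance_ratio(labels):
    """Majority-class count divided by minority-class count."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


def random_oversample(samples, labels, target_ratio=1.0, seed=0):
    """Re-sampling sketch: duplicate randomly chosen samples of each smaller
    class until it reaches target_ratio * (majority-class count)."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = int(target_ratio * max(counts.values()))
    out_x, out_y = list(samples), list(labels)
    for cls, n in counts.items():
        pool = [x for x, y in zip(samples, labels) if y == cls]
        # range() is empty when the class already meets the target.
        for _ in range(target - n):
            out_x.append(rng.choice(pool))
            out_y.append(cls)
    return out_x, out_y


def balanced_class_weights(labels):
    """Algorithmic sketch: per-class cost inversely proportional to class
    frequency, computed as n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {cls: n / (k * c) for cls, c in counts.items()}


# Hypothetical toy data: 90 majority samples (class 0), 10 minority (class 1).
X = [[float(i)] for i in range(100)]
y = [0] * 90 + [1] * 10

ratio = imbalance_ratio(y)

# Re-sampling only: balance the training set, leave the learner unchanged.
X_res, y_res = random_oversample(X, y)

# Algorithmic only: keep the data, penalize minority errors more heavily.
weights = balanced_class_weights(y)

# Fusion: oversample part of the way, then weight the remaining imbalance.
X_fus, y_fus = random_oversample(X, y, target_ratio=0.5)
fusion_weights = balanced_class_weights(y_fus)
```

The sketch also hints at the trade-off reported above: the algorithmic route computes weights from the labels alone, whereas the re-sampling and fusion routes enlarge the training set before any model is fitted.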