An extended asymmetric sigmoid with Perceptron (SIGTRON) for imbalanced linear classification
This work addresses classification on imbalanced datasets, offering a more adaptive approach than existing cost-sensitive methods, though the contribution is incremental in that it builds on established sigmoid and convex-loss frameworks.
The paper tackles imbalanced classification by proposing SIGTRON, an extended asymmetric sigmoid with Perceptron, together with the SIGTRON-imbalanced classification (SIC) model. The SIC model adapts better than conventional methods to differences in the class-imbalance ratio between training and test datasets, and it achieves comparable or superior test accuracy on 51 two-class and 67 multi-class datasets.
This article presents a new polynomial parameterized sigmoid called SIGTRON, an extended asymmetric sigmoid with Perceptron, and its companion convex model, the SIGTRON-imbalanced classification (SIC) model, which employs a virtual SIGTRON-induced convex loss function. In contrast to the conventional $\pi$-weighted cost-sensitive learning model, the SIC model has no external $\pi$-weight on the loss function; instead, it has internal parameters in the virtual SIGTRON-induced loss function. As a consequence, when the given training dataset is close to the well-balanced condition with respect to the (scale-)class-imbalance ratio, we show that the proposed SIC model is more adaptive to variations of the dataset, such as an inconsistency of the (scale-)class-imbalance ratio between the training and test datasets. This adaptation is justified by a skewed hyperplane equation, obtained by linearizing the gradient that satisfies the $\varepsilon$-optimal condition. Additionally, we present a quasi-Newton optimization (L-BFGS) framework for the virtual convex loss by developing an interval-based bisection line search. Empirically, we observe that the proposed approach outperforms (or is comparable to) the $\pi$-weighted convex focal loss and the balanced classifier LIBLINEAR (logistic regression, SVM, and L2SVM) in terms of test classification accuracy on $51$ two-class and $67$ multi-class datasets. In binary classification problems where the scale-class-imbalance ratio of the training dataset is not significant but the inconsistency exists, a group of SIC models with the best test accuracy for each dataset (TOP$1$) outperforms LIBSVM (C-SVC with RBF kernel), a well-known kernel-based classifier.
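For readers unfamiliar with the baseline that the SIC model is contrasted with, the following is a minimal sketch of a conventional $\pi$-weighted cost-sensitive loss, using the ordinary logistic loss as the base loss. It is not the SIGTRON-induced loss, whose definition is not given in this abstract. The function names, the label convention $y \in \{+1, -1\}$, and the `balanced_pi` heuristic (weighting positive terms by the fraction of negative samples) are assumptions made here for illustration only.

```python
import math


def logistic_loss(margin):
    """Numerically stable log(1 + exp(-margin))."""
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


def pi_weighted_logistic_loss(w, b, X, y, pi):
    """Average logistic loss with an external pi-weight (illustrative baseline).

    X  : list of feature vectors (lists of floats)
    y  : labels in {+1, -1} (assumed convention)
    pi : weight in (0, 1) applied to positive-class terms;
         negative-class terms receive 1 - pi
    """
    total = 0.0
    for x, label in zip(X, y):
        score = sum(wj * xj for wj, xj in zip(w, x)) + b
        weight = pi if label == 1 else 1.0 - pi
        total += weight * logistic_loss(label * score)
    return total / len(X)


def balanced_pi(y):
    """Common heuristic (an assumption here): weight positives by the
    fraction of negative samples, so the rarer class counts more."""
    n_neg = sum(1 for label in y if label == -1)
    return n_neg / len(y)


# Tiny example: two positive samples, one negative sample.
X = [[1.0], [2.0], [-1.0]]
y = [1, 1, -1]
pi = balanced_pi(y)
loss_at_zero = pi_weighted_logistic_loss([0.0], 0.0, X, y, pi)
```

The point of the sketch is that $\pi$ sits outside the loss and is fixed from the training labels. If the class proportions of the test set differ from those of the training set, this fixed external weight no longer matches the data, which is the kind of inconsistency the abstract says the SIC model handles through internal loss parameters instead.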