LG CV | May 3, 2021

Weighted Least Squares Twin Support Vector Machine with Fuzzy Rough Set Theory for Imbalanced Data Classification

Maysam Behmanesh, Peyman Adibi, Hossein Karshenas

arXiv:2105.01198v2 | 1.64 citations

Originality: Incremental advance

AI Analysis

This work addresses classification challenges for imbalanced data, which is a common issue in domains like medical diagnosis or fraud detection, but it is incremental as it builds on existing SVM and rough set techniques.

The authors tackled the problem of poor SVM performance on imbalanced data by proposing FRLSTSVM, which integrates fuzzy rough set theory with weighted least squares twin SVM, showing superior results compared to traditional SVM-based methods on imbalanced datasets.

Support vector machines (SVMs) are powerful supervised learning tools developed to solve classification problems. However, SVMs are likely to perform poorly when classifying imbalanced data. Rough set theory provides a mathematical tool for inference in nondeterministic cases, along with methods for removing irrelevant information from data. In this work, we propose an approach, called FRLSTSVM, that efficiently uses fuzzy rough set theory in a weighted least squares twin support vector machine for the classification of imbalanced data. The first innovation is a new fuzzy rough set-based under-sampling strategy that makes the classifier robust to imbalanced data. To construct the two proximal hyperplanes in FRLSTSVM, data points from the minority class remain unchanged, while a subset of data points from the majority class is selected using a new method. In this model, we embed weight biases in the LSTSVM formulations to overcome the bias phenomenon of the original twin SVM in the classification of imbalanced data. The second innovation is a new strategy that uses fuzzy rough set theory to determine these weights. Experimental results on well-known imbalanced datasets, compared with related traditional SVM-based methods, demonstrate the superiority of the proposed FRLSTSVM model in imbalanced data classification.
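
To make the idea of two proximal hyperplanes with per-sample weights more concrete, here is a minimal sketch of a generic linear weighted least squares twin SVM in NumPy. It is not the authors' FRLSTSVM formulation: the abstract does not give the fuzzy rough set under-sampling rule or the weight computation, so the weights are simply passed in by the caller. The function names, the `ridge` stabiliser, and the class-ratio weights in the demo are illustrative assumptions.

```python
import numpy as np


def fit_weighted_lstsvm(X_min, X_maj, w_min=None, w_maj=None,
                        c1=1.0, c2=1.0, ridge=1e-8):
    """Fit two proximal hyperplanes for a linear, weighted LSTSVM sketch.

    X_min, X_maj : arrays of minority (+1) and majority (-1) samples.
    w_min, w_maj : per-sample weights on the squared slack terms.
                   ASSUMPTION: supplied by the caller; the paper derives
                   them from fuzzy rough set theory, not reproduced here.
    Returns u1, u2, each a vector [w; b] describing one hyperplane.
    """
    X_min = np.asarray(X_min, dtype=float)
    X_maj = np.asarray(X_maj, dtype=float)
    m1, m2 = len(X_min), len(X_maj)
    w_min = np.ones(m1) if w_min is None else np.asarray(w_min, dtype=float)
    w_maj = np.ones(m2) if w_maj is None else np.asarray(w_maj, dtype=float)

    # Augment with a column of ones so the bias is part of the solution.
    E = np.hstack([X_min, np.ones((m1, 1))])
    F = np.hstack([X_maj, np.ones((m2, 1))])
    reg = ridge * np.eye(E.shape[1])  # hypothetical numerical stabiliser

    # Plane 1: close to the minority class, majority pushed towards -1.
    # Minimises 0.5*||E u||^2 + 0.5*c1*(F u + e)^T D_maj (F u + e).
    u1 = -np.linalg.solve(E.T @ E + c1 * (F.T * w_maj) @ F + reg,
                          c1 * (F.T @ w_maj))

    # Plane 2: close to the majority class, minority pushed towards +1.
    # Minimises 0.5*||F u||^2 + 0.5*c2*(e - E u)^T D_min (e - E u).
    u2 = np.linalg.solve(F.T @ F + c2 * (E.T * w_min) @ E + reg,
                         c2 * (E.T @ w_min))
    return u1, u2


def predict(X, u1, u2):
    """Assign each sample to the class whose hyperplane is nearer."""
    X = np.asarray(X, dtype=float)
    Z = np.hstack([X, np.ones((len(X), 1))])
    d1 = np.abs(Z @ u1) / np.linalg.norm(u1[:-1])
    d2 = np.abs(Z @ u2) / np.linalg.norm(u2[:-1])
    return np.where(d1 <= d2, 1, -1)


# Toy imbalanced data: 5 minority points vs. 40 majority points.
rng = np.random.default_rng(0)
X_min = rng.normal(loc=2.0, scale=0.5, size=(5, 2))
X_maj = rng.normal(loc=-2.0, scale=0.5, size=(40, 2))

# ASSUMPTION: class-ratio weights stand in for the fuzzy rough set weights.
w_maj = np.full(len(X_maj), len(X_min) / len(X_maj))
u1, u2 = fit_weighted_lstsvm(X_min, X_maj, w_maj=w_maj)
labels = predict(np.array([[2.0, 2.0], [-2.0, -2.0]]), u1, u2)
```

Each hyperplane is obtained from a single linear solve, which is the main practical attraction of the least squares variant over a twin SVM that solves two quadratic programs. In this sketch, an under-sampling step would amount to passing a subset of the majority samples as `X_maj`.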
