LG · Feb 6, 2017

Optimizing Cost-Sensitive SVM for Imbalanced Data: Connecting Cluster to Classification

arXiv:1702.01504v1 · 9 citations

Originality: Incremental advance

AI Analysis

This work addresses class imbalance issues in applications like accident monitoring, offering an incremental improvement over existing cost-sensitive techniques.

The paper tackles the problem of class imbalance in machine learning by proposing a novel cost-sensitive SVM method that optimizes penalty parameters using cluster probability density functions, showing effectiveness on benchmark and real-world datasets with varying imbalance ratios.

Class imbalance is one of the challenging problems for machine learning in many real-world applications, such as coal and gas burst accident monitoring: the burst premonition data is far scarcer than the normal data, yet it is precisely the class of interest. Cost-sensitive adjustment is a typical algorithm-level method for countering data set imbalance. For SVM classifiers, the model is modified to incorporate a different penalty parameter (C) for each of the considered groups of examples. However, the C value is usually determined empirically or calculated according to an evaluation metric, which must be computed iteratively and is time-consuming. This paper presents a novel cost-sensitive SVM method whose penalty parameter C is optimized on the basis of the cluster probability density function (PDF); the cluster PDF is estimated using only the similarity matrix and some predefined hyper-parameters. Experimental results on various standard benchmark data sets and real-world data with different imbalance ratios show that the proposed method is effective in comparison with commonly used cost-sensitive techniques.
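To make the idea of a per-class penalty parameter concrete, the sketch below shows the common empirical baseline the abstract contrasts against: setting each class's C inversely proportional to its frequency and applying it to the hinge loss. This is not the paper's PDF-based optimization, whose details the abstract does not give. The function names `balanced_penalties` and `weighted_hinge_loss`, the `base_c` parameter, and the 95:5 toy labels are assumptions introduced only for illustration.

```python
from collections import Counter


def balanced_penalties(labels, base_c=1.0):
    """Inverse-frequency heuristic (an assumption, not the paper's method):
    C_k = base_c * n / (K * n_k), so rarer classes get a larger penalty."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {cls: base_c * n / (k * cnt) for cls, cnt in counts.items()}


def weighted_hinge_loss(scores, labels, penalties):
    """Cost-sensitive hinge loss: sum of C_{y_i} * max(0, 1 - y_i * f(x_i)),
    with labels in {-1, +1} and scores f(x_i) from any decision function."""
    return sum(
        penalties[y] * max(0.0, 1.0 - y * s) for s, y in zip(scores, labels)
    )


# Toy imbalanced label set: 95 normal examples (-1), 5 minority examples (+1).
labels = [-1] * 95 + [1] * 5
penalties = balanced_penalties(labels)

# A margin violation on the minority class costs 19x more than the same
# violation on the majority class under this heuristic.
ratio = penalties[1] / penalties[-1]

# Loss for two hypothetical decision values, one per class.
loss = weighted_hinge_loss([0.5, -0.5], [1, -1], {1: 10.0, -1: 0.5})
```

With the 95:5 split, the minority class receives C = 10 and the majority class roughly 0.53. A fixed rule like this is cheap but ignores how the data is distributed, which is the gap the abstract says the cluster-PDF approach aims to address.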
