Classification Performance Metric for Imbalance Data Based on Recall and Selectivity Normalized in Class Labels
This work addresses the challenge of model selection and comparison for researchers and practitioners working with skewed data, though the contribution is incremental, since it builds on existing metrics.
The authors tackled the problem of selecting a universal performance metric for imbalanced classification datasets by introducing a new measure based on the harmonic mean of Recall and Selectivity normalized in class labels. They showed that this measure is less sensitive to changes in the majority class and more sensitive to changes in the minority class compared to existing single-value metrics, with analytical proofs provided for its properties.
When classifying a class-imbalanced dataset, the choice of performance measure used for model selection and for comparison with competing methods is a major issue. To address this problem, several performance measures have been defined and analyzed from several perspectives, in particular with regard to the imbalance ratio. There is still no clear indication of which metric is universal and can be used for any skewed data problem. In this paper we introduce a new performance measure based on the harmonic mean of Recall and Selectivity normalized in class labels. We show that the proposed performance measure has the right properties for imbalanced datasets. In particular, in the space defined by the majority class examples and the imbalance ratio, it is less sensitive to changes in the majority class and more sensitive to changes in the minority class than other existing single-value performance measures. Additionally, the identity of the other performance measures has been proven analytically.
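To make the building blocks concrete, the following is a minimal sketch of a plain harmonic mean of Recall and Selectivity computed from confusion-matrix counts. It is an illustration under stated assumptions, not the paper's exact measure: the abstract does not spell out how the normalization in class labels is carried out, so that step is omitted. The function names and the example counts are hypothetical, and the minority class is assumed to be the positive class.

```python
def recall(tp: int, fn: int) -> float:
    """True positive rate: TP / (TP + FN)."""
    positives = tp + fn
    return tp / positives if positives else 0.0


def selectivity(tn: int, fp: int) -> float:
    """True negative rate (specificity): TN / (TN + FP)."""
    negatives = tn + fp
    return tn / negatives if negatives else 0.0


def harmonic_mean_recall_selectivity(tp: int, fn: int, tn: int, fp: int) -> float:
    """Plain harmonic mean of Recall and Selectivity.

    Assumption: this omits the class-label normalization described in the
    paper, whose details are not given in the abstract.
    """
    r = recall(tp, fn)
    s = selectivity(tn, fp)
    return 2.0 * r * s / (r + s) if (r + s) else 0.0


def accuracy(tp: int, fn: int, tn: int, fp: int) -> float:
    """Fraction of all examples classified correctly."""
    total = tp + fn + tn + fp
    return (tp + tn) / total if total else 0.0


if __name__ == "__main__":
    # Hypothetical skewed dataset: 10 minority (positive), 990 majority (negative).
    # A classifier that always predicts the majority class:
    always_majority = dict(tp=0, fn=10, tn=990, fp=0)
    # A classifier that finds most minority examples at the cost of some false alarms:
    balanced = dict(tp=8, fn=2, tn=891, fp=99)

    for name, counts in [("always majority", always_majority), ("balanced", balanced)]:
        print(name,
              "accuracy=%.3f" % accuracy(**counts),
              "harmonic mean=%.3f" % harmonic_mean_recall_selectivity(**counts))
```

In this hypothetical example, the always-majority classifier reaches high accuracy while its harmonic mean is zero, which illustrates why a measure that combines Recall and Selectivity is attractive for skewed data.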