ME, CV, ML · Feb 12, 2012

A better Beta for the H measure of classification performance

arXiv:1202.2564v2 · 74 citations
AI Analysis

This work provides an incremental improvement for researchers dealing with classification performance evaluation, particularly in unbalanced data scenarios.

The H measure was introduced to address the incoherence of the area under the ROC curve. This paper proposes a modified standard distribution for the H measure, the Beta(π₁+1, π₀+1) distribution, to better handle heavily unbalanced datasets.

The area under the ROC curve is widely used as a measure of performance of classification rules. However, it has recently been shown that the measure is fundamentally incoherent, in the sense that it treats the relative severities of misclassifications differently when different classifiers are used. To overcome this, Hand (2009) proposed the $H$ measure, which allows a given researcher to fix the distribution of relative severities to a classifier-independent setting on a given problem. This note extends the discussion and proposes a modified standard distribution for the $H$ measure, the $\mathrm{Beta}(\pi_1+1,\pi_0+1)$ distribution, which better matches the requirements of researchers, in particular those faced with heavily unbalanced datasets. [Preprint submitted to Pattern Recognition Letters]
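
As a rough illustration of the Beta(π₁+1, π₀+1) distribution named in the abstract, the sketch below derives the two shape parameters from the class proportions of a labelled sample and evaluates the density. It assumes that π₁ and π₀ are the empirical proportions of the classes coded 1 and 0 (which class is coded 1 is a choice made here, not something the abstract specifies), and the helper names `beta_params`, `beta_pdf` and `beta_mode` are hypothetical. It is not the paper's implementation of the H measure, only the density itself.

```python
import math

def beta_params(labels):
    """Return (a, b) = (pi1 + 1, pi0 + 1) from a list of 0/1 labels.

    Assumption: pi1 is the proportion of labels equal to 1, pi0 = 1 - pi1.
    """
    n = len(labels)
    pi1 = sum(1 for y in labels if y == 1) / n
    pi0 = 1.0 - pi1
    return pi1 + 1.0, pi0 + 1.0

def beta_pdf(c, a, b):
    """Density of Beta(a, b) at a point c strictly inside (0, 1)."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1) * math.log(c) + (b - 1) * math.log(1 - c))

def beta_mode(a, b):
    """Mode of Beta(a, b); valid when a > 1 and b > 1."""
    return (a - 1) / (a + b - 2)

# Heavily unbalanced toy sample: 5% of the labels are 1.
labels = [1] * 5 + [0] * 95
a, b = beta_params(labels)
mode = beta_mode(a, b)
```

Because π₀ + π₁ = 1, the mode (a − 1)/(a + b − 2) reduces to π₁, so under these assumptions the density peaks at the proportion of the class coded 1 rather than at 1/2.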
