ML, LG
Jun 17, 2020

The MCC-F1 curve: a performance evaluation technique for binary classification

arXiv:2006.11278v1
60 citations
Originality: Incremental advance
AI Analysis

The paper addresses evaluation issues faced by researchers and practitioners in fields that use binary classification, but it is an incremental advance because it builds on existing metrics.

The paper tackles the problem of misleading performance evaluations from ROC and PR curves in binary classification, especially with imbalanced data, by proposing the MCC-F1 curve, which more clearly differentiates classifiers and includes a single integrated metric.

Many fields use the ROC curve and the PR curve as standard evaluations of binary classification methods. Analysis of ROC and PR, however, often gives misleading and inflated performance evaluations, especially with an imbalanced ground truth. Here, we demonstrate the problems with ROC and PR analysis through simulations, and propose the MCC-F1 curve to address these drawbacks. The MCC-F1 curve combines two informative single-threshold metrics, MCC and the F1 score. The MCC-F1 curve more clearly differentiates good and bad classifiers, even with imbalanced ground truths. We also introduce the MCC-F1 metric, which provides a single value that integrates many aspects of classifier performance across the whole range of classification thresholds. Finally, we provide an R package that plots MCC-F1 curves and calculates related metrics.
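To make the idea of combining MCC and the F1 score across thresholds concrete, the following is a minimal Python sketch, not the authors' R package. The function name `mcc_f1_points` is hypothetical. It assumes that a score at or above the threshold counts as a positive prediction, that MCC is rescaled to the unit interval as (MCC + 1) / 2 before plotting, and that an undefined MCC or F1 (zero denominator) is reported as 0; the abstract specifies none of these details.

```python
import math


def mcc_f1_points(labels, scores):
    """Return (threshold, f1, normalized_mcc) for each distinct score threshold.

    labels: iterable of 0/1 ground-truth values.
    scores: iterable of classifier scores; higher means "more positive".
    Assumption: score >= threshold is predicted positive.
    """
    pairs = list(zip(labels, scores))
    points = []
    for t in sorted(set(scores)):
        # Confusion-matrix counts at this threshold.
        tp = sum(1 for y, s in pairs if s >= t and y == 1)
        fp = sum(1 for y, s in pairs if s >= t and y == 0)
        fn = sum(1 for y, s in pairs if s < t and y == 1)
        tn = sum(1 for y, s in pairs if s < t and y == 0)

        # Matthews correlation coefficient; reported as 0 when undefined (assumption).
        denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
        mcc = (tp * tn - fp * fn) / denom if denom > 0 else 0.0

        # F1 score; reported as 0 when undefined (assumption).
        f1_denom = 2 * tp + fp + fn
        f1 = 2 * tp / f1_denom if f1_denom > 0 else 0.0

        # Assumed rescaling of MCC from [-1, 1] to [0, 1].
        points.append((t, f1, (mcc + 1) / 2))
    return points


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    toy_labels = [0, 0, 1, 1]
    toy_scores = [0.1, 0.4, 0.35, 0.8]
    for threshold, f1, mcc_norm in mcc_f1_points(toy_labels, toy_scores):
        print(f"threshold={threshold:.2f}  F1={f1:.3f}  normalized MCC={mcc_norm:.3f}")
```

Plotting the F1 values against the normalized MCC values gives one curve per classifier. Under these assumptions, a perfect classifier reaches the point (1, 1) at some threshold, while a threshold that predicts every sample as positive yields a normalized MCC of 0.5.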

Code Implementations: 3 repos