Consistent Classification with Generalized Metrics
This work develops theoretical foundations for classification metrics in multiclass and multioutput settings. The contribution is incremental but important for researchers working on machine learning evaluation.
The authors tackle the problem of constructing and analyzing multiclass and multioutput classification metrics. Their analysis yields insights into the geometry of feasible confusion tensors, including necessary and sufficient conditions under which optimizing an arbitrary non-decomposable metric is equivalent to learning a weighted classifier. They show that the resulting plug-in estimator is consistent and easily implemented as a post-processing rule, with empirical results on synthetic and benchmark datasets supporting the theoretical findings.
We propose a framework for constructing and analyzing multiclass and multioutput classification metrics, i.e., metrics for problems involving multiple, possibly correlated multiclass labels. Our analysis reveals novel insights into the geometry of feasible confusion tensors, including necessary and sufficient conditions for the equivalence between optimizing an arbitrary non-decomposable metric and learning a weighted classifier. Further, we analyze averaging methodologies commonly used to compute multioutput metrics and characterize the corresponding Bayes optimal classifiers. We show that the plug-in estimator based on this characterization is consistent and is easily implemented as a post-processing rule. Empirical results on synthetic and benchmark datasets support the theoretical findings.
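To make the idea of a weighted classifier applied as a post-processing rule more concrete, here is a minimal sketch for a single multiclass output. It is not the authors' algorithm: the function names (`weighted_plugin_predict`, `confusion_matrix`, `macro_f1`) and the gain matrix `gain` are hypothetical, and the sketch assumes class-probability estimates are already available from some fitted model. How the gain matrix would be chosen for a given metric is not specified here.

```python
import numpy as np

def weighted_plugin_predict(probs, gain):
    """Post-process probability estimates with a (hypothetical) gain matrix.

    probs: (n_samples, n_classes) estimated class probabilities.
    gain:  (n_classes, n_classes), gain[i, j] = reward for predicting j
           when the true class is i. This is an illustrative assumption,
           not a quantity taken from the paper.
    Returns the class j maximizing the expected gain sum_i probs[n, i] * gain[i, j].
    """
    scores = probs @ gain
    return scores.argmax(axis=1)

def confusion_matrix(y_true, y_pred, num_classes):
    """Normalized confusion matrix: C[i, j] = fraction with true class i, predicted j."""
    C = np.zeros((num_classes, num_classes))
    np.add.at(C, (np.asarray(y_true), np.asarray(y_pred)), 1.0)
    return C / len(y_true)

def macro_f1(C):
    """Macro-averaged F1, a non-decomposable metric computed from the confusion matrix."""
    tp = np.diag(C)
    denom = C.sum(axis=0) + C.sum(axis=1)  # equals 2*TP + FP + FN per class
    safe = np.where(denom > 0, denom, 1.0)  # avoid division by zero
    f1 = np.where(denom > 0, 2.0 * tp / safe, 0.0)  # absent classes score 0
    return f1.mean()

# Toy probability estimates for three samples and three classes.
probs = np.array([[0.60, 0.30, 0.10],
                  [0.20, 0.50, 0.30],
                  [0.40, 0.35, 0.25]])

plain = weighted_plugin_predict(probs, np.eye(3))                     # ordinary argmax
reweighted = weighted_plugin_predict(probs, np.diag([1.0, 3.0, 1.0])) # favors class 1
```

With the identity gain the rule reduces to the usual argmax over probabilities; a non-uniform gain shifts decisions toward the upweighted classes without retraining the underlying model, which is the sense in which the rule is a post-processing step.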