LG CV PF | Dec 5, 2024

Foundations of the Theory of Performance-Based Ranking

Sébastien Piérard, Anaïs Halin, Anthony Cioppa, Adrien Deliège, Marc Van Droogenbroeck

arXiv:2412.04227v3 | 10.45 citations | h-index: 20 | CVPR

Originality: Incremental advance

AI Analysis

This work provides a foundational theory for performance-based ranking. The advance is incremental in that it formalizes and extends existing ranking concepts from fields such as machine learning and statistics.

The paper tackles the challenge of ranking entities based on performance while incorporating application-specific preferences. It establishes a universal theoretical framework and axiomatic definitions for performance orderings and rankings, and it introduces a parametric family of ranking scores that generalizes common metrics such as accuracy and F1. It also shows that some commonly used scores cannot yield orderings that satisfy the proposed axioms.

Ranking entities such as algorithms, devices, methods, or models based on their performances, while accounting for application-specific preferences, is a challenge. To address this challenge, we establish the foundations of a universal theory for performance-based ranking. First, we introduce a rigorous framework built on top of both the probability and order theories. Our new framework encompasses the elements necessary to (1) manipulate performances as mathematical objects, (2) express which performances are worse than or equivalent to others, (3) model tasks through a variable called satisfaction, (4) consider properties of the evaluation, (5) define scores, and (6) specify application-specific preferences through a variable called importance. On top of this framework, we propose the first axiomatic definition of performance orderings and performance-based rankings. Then, we introduce a universal parametric family of scores, called ranking scores, that can be used to establish rankings satisfying our axioms, while considering application-specific preferences. Finally, we show, in the case of two-class classification, that the family of ranking scores encompasses well-known performance scores, including the accuracy, the true positive rate (recall, sensitivity), the true negative rate (specificity), the positive predictive value (precision), and F1. However, we also show that some other scores commonly used to compare classifiers are unsuitable to derive performance orderings satisfying the axioms.
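To make the two-class claim concrete, here is a minimal Python sketch. It assumes, and the abstract does not state this, that a ranking score takes the form of an importance-weighted expected satisfaction, with satisfaction equal to 1 for correct outcomes (tn, tp) and 0 for errors (fp, fn). The function name `ranking_score`, the importance tables, and the example performance are all hypothetical illustrations, not the authors' notation or data.

```python
from fractions import Fraction

OUTCOMES = ("tn", "fp", "fn", "tp")

# Assumed satisfaction: 1 for a correct decision, 0 for an error.
SATISFACTION = {"tn": 1, "fp": 0, "fn": 0, "tp": 1}


def ranking_score(performance, importance):
    """Importance-weighted expected satisfaction (assumed form of the score).

    performance: probability of each outcome (values sum to 1).
    importance:  non-negative weight of each outcome.
    """
    num = sum(importance[o] * SATISFACTION[o] * performance[o] for o in OUTCOMES)
    den = sum(importance[o] * performance[o] for o in OUTCOMES)
    if den == 0:
        raise ValueError("undefined: no probability mass on important outcomes")
    return num / den


# Hypothetical importance choices that recover the scores named in the abstract.
IMPORTANCE = {
    "accuracy": {"tn": 1, "fp": 1, "fn": 1, "tp": 1},
    "tpr": {"tn": 0, "fp": 0, "fn": 1, "tp": 1},
    "tnr": {"tn": 1, "fp": 1, "fn": 0, "tp": 0},
    "ppv": {"tn": 0, "fp": 1, "fn": 0, "tp": 1},
    "f1": {"tn": 0, "fp": Fraction(1, 2), "fn": Fraction(1, 2), "tp": 1},
}

# Made-up performance: a normalized confusion matrix over 100 samples.
performance = {
    "tn": Fraction(50, 100),
    "fp": Fraction(10, 100),
    "fn": Fraction(5, 100),
    "tp": Fraction(35, 100),
}

scores = {name: ranking_score(performance, imp) for name, imp in IMPORTANCE.items()}

for name, value in scores.items():
    print(f"{name}: {value} ({float(value):.4f})")
```

Under this assumed form, changing only the importance values moves between accuracy, recall, specificity, precision, and F1, which is one way to read "application-specific preferences" in the abstract. Exact fractions are used so each result can be checked against the textbook formula, for example F1 = 2tp / (2tp + fp + fn).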
