LG AI IR ML | Apr 15, 2025

Bipartite Ranking From Multiple Labels: On Loss Versus Label Aggregation

Michal Lukasik, Lin Chen, Harikrishna Narasimhan, Aditya Krishna Menon, Wittawat Jitkrittum, Felix X. Yu, Sashank J. Reddi, Gang Fu, Mohammadhossein Bateni, Sanjiv Kumar

arXiv:2504.11284v29.42 citations | h-index: 42 | ICML

Originality: Incremental advance

AI Analysis

The paper addresses a fundamental issue in supervised learning for ranking tasks, particularly when labels come from multiple annotators, but the advance is incremental because it builds on existing bipartite ranking frameworks.

The paper tackles bipartite ranking when multiple binary labels are available. It analyzes two approaches, loss aggregation and label aggregation, and shows that loss aggregation can lead to label dictatorship, where one label is favored over the others. It concludes that label aggregation can be preferable and supports this with experiments.

Bipartite ranking is a fundamental supervised learning problem, with the goal of learning a ranking over instances with maximal Area Under the ROC Curve (AUC) against a single binary target label. However, one may often observe multiple binary target labels, e.g., from distinct human annotators. How can one synthesize such labels into a single coherent ranking? In this work, we formally analyze two approaches to this problem -- loss aggregation and label aggregation -- by characterizing their Bayes-optimal solutions. We show that while both approaches can yield Pareto-optimal solutions, loss aggregation can exhibit label dictatorship: one can inadvertently (and undesirably) favor one label over others. This suggests that label aggregation can be preferable to loss aggregation, which we empirically verify.
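To make the two approaches in the abstract concrete, the following is a minimal sketch, not the paper's exact formulation. It assumes that loss aggregation means a weighted average of per-label AUCs and that label aggregation means summing the binary labels into one target and scoring it with a pairwise (multipartite) generalization of AUC. The function names `pairwise_auc`, `loss_aggregated_auc`, and `label_aggregated_auc`, the uniform weights, and the toy data are all hypothetical choices made for illustration.

```python
import numpy as np


def pairwise_auc(scores, labels):
    """Fraction of pairs with labels[i] > labels[j] that are ranked correctly.

    For binary labels this is the usual AUC; score ties count as half.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=float)
    label_diff = labels[:, None] - labels[None, :]
    score_diff = scores[:, None] - scores[None, :]
    ordered = label_diff > 0  # pairs where instance i should outrank j
    n_pairs = ordered.sum()
    if n_pairs == 0:
        raise ValueError("need at least two distinct label values")
    correct = (score_diff > 0)[ordered].sum()
    ties = (score_diff == 0)[ordered].sum()
    return (correct + 0.5 * ties) / n_pairs


def loss_aggregated_auc(scores, label_matrix, weights=None):
    """Assumed loss aggregation: weighted average of the per-label AUCs."""
    label_matrix = np.asarray(label_matrix, dtype=float)
    k = label_matrix.shape[1]
    if weights is None:
        weights = np.full(k, 1.0 / k)  # uniform weights (an assumption)
    else:
        weights = np.asarray(weights, dtype=float)
    per_label = np.array(
        [pairwise_auc(scores, label_matrix[:, j]) for j in range(k)]
    )
    return float(weights @ per_label)


def label_aggregated_auc(scores, label_matrix):
    """Assumed label aggregation: sum the labels, then score one ranking."""
    aggregated = np.asarray(label_matrix, dtype=float).sum(axis=1)
    return float(pairwise_auc(scores, aggregated))


# Toy data: four instances, two annotators (columns).
Y = np.array([[1, 1], [1, 0], [0, 1], [0, 0]])
scores = np.array([0.9, 0.7, 0.4, 0.1])

print(loss_aggregated_auc(scores, Y))   # average of per-label AUCs
print(label_aggregated_auc(scores, Y))  # AUC against the summed label
```

In this toy example the scorer orders instances perfectly for the first annotator but misranks one pair for the second, so the two objectives assign it different values. The sketch only evaluates a fixed scorer under each objective; it does not reproduce the paper's Bayes-optimal analysis.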
