LG | AI | IR | ML | Apr 15, 2025

Bipartite Ranking From Multiple Labels: On Loss Versus Label Aggregation

arXiv:2504.11284v2 | 2 citations | h-index: 42 | ICML
Originality: Incremental advance
AI Analysis

The paper addresses a fundamental issue in supervised learning for ranking tasks, particularly in scenarios with multiple annotators. The advance is incremental because it builds on existing bipartite ranking frameworks.

The paper tackles bipartite ranking when multiple binary labels are available. It analyzes two approaches, loss aggregation and label aggregation, and shows that loss aggregation can lead to label dictatorship, where one label is inadvertently favored over the others. It concludes that label aggregation can be preferable, and verifies this empirically.

Bipartite ranking is a fundamental supervised learning problem, with the goal of learning a ranking over instances with maximal Area Under the ROC Curve (AUC) against a single binary target label. However, one may often observe multiple binary target labels, e.g., from distinct human annotators. How can one synthesize such labels into a single coherent ranking? In this work, we formally analyze two approaches to this problem -- loss aggregation and label aggregation -- by characterizing their Bayes-optimal solutions. We show that while both approaches can yield Pareto-optimal solutions, loss aggregation can exhibit label dictatorship: one can inadvertently (and undesirably) favor one label over others. This suggests that label aggregation can be preferable to loss aggregation, which we empirically verify.
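To make the two approaches concrete, here is a minimal Python sketch on toy data. It is an illustration under stated assumptions, not the paper's method: this summary does not give the exact definitions, so the sketch assumes loss aggregation means a weighted average of per-label AUCs and label aggregation means summing the binary labels into one graded target. All function names and the example data are hypothetical.

```python
from itertools import combinations


def auc(scores, labels):
    """Empirical AUC: fraction of (positive, negative) pairs ranked correctly.

    Ties in score count as half a correct pair.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def loss_aggregated_auc(scores, label_sets, weights=None):
    """ASSUMED form of loss aggregation: weighted average of per-label AUCs."""
    k = len(label_sets)
    weights = weights or [1.0 / k] * k
    return sum(w * auc(scores, ys) for w, ys in zip(weights, label_sets))


def aggregate_labels(label_sets):
    """ASSUMED form of label aggregation: per-instance sum of binary labels."""
    return [sum(ys) for ys in zip(*label_sets)]


def pairwise_auc(scores, targets):
    """AUC generalized to graded targets.

    Over all pairs whose targets differ, the fraction in which the
    instance with the higher target also has the higher score.
    """
    total, wins = 0, 0.0
    for (s_i, t_i), (s_j, t_j) in combinations(zip(scores, targets), 2):
        if t_i == t_j:
            continue  # pairs with equal targets carry no ranking preference
        total += 1
        hi, lo = (s_i, s_j) if t_i > t_j else (s_j, s_i)
        wins += 1.0 if hi > lo else 0.5 if hi == lo else 0.0
    if total == 0:
        raise ValueError("all targets are equal; ranking quality is undefined")
    return wins / total


# Toy data: one scorer, two hypothetical annotators who partly disagree.
scores = [0.9, 0.7, 0.4, 0.1]
annotator_a = [1, 1, 0, 0]
annotator_b = [1, 0, 1, 0]

loss_agg = loss_aggregated_auc(scores, [annotator_a, annotator_b])
label_agg = pairwise_auc(scores, aggregate_labels([annotator_a, annotator_b]))
print(loss_agg, label_agg)
```

On this toy data the scorer ranks perfectly against the first annotator and imperfectly against the second, so the averaged AUC falls below 1. The summed labels only require that the instance both annotators marked positive ranks first and the one both marked negative ranks last, which the scorer satisfies. The sketch only shows how the two objectives are computed. It does not reproduce the label dictatorship result, which concerns Bayes-optimal solutions.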

