LGJan 29, 2024

Consistent algorithms for multi-label classification with macro-at-$k$ metrics

arXiv:2401.16594v37 citationsh-index: 15ICLR
Originality Incremental advance
AI Analysis

This addresses a challenging optimization problem in multi-label classification for applications with long-tail labels, though it is incremental as it builds on existing frameworks.

The paper tackles the optimization of macro-at-k metrics in multi-label classification, where predictions are constrained to exactly k labels per instance, by proving the existence of an optimal classifier and proposing a statistically consistent algorithm based on the Frank-Wolfe method, with empirical results showing competitive performance.

We consider the optimization of complex performance metrics in multi-label classification under the population utility framework. We mainly focus on metrics linearly decomposable into a sum of binary classification utilities applied separately to each label with an additional requirement of exactly $k$ labels predicted for each instance. These "macro-at-$k$" metrics possess desired properties for extreme classification problems with long tail labels. Unfortunately, the at-$k$ constraint couples the otherwise independent binary classification tasks, leading to a much more challenging optimization problem than standard macro-averages. We provide a statistical framework to study this problem, prove the existence and the form of the optimal classifier, and propose a statistically consistent and practical learning algorithm based on the Frank-Wolfe method. Interestingly, our main results concern even more general metrics being non-linear functions of label-wise confusion matrices. Empirical results provide evidence for the competitive performance of the proposed approach.

Code Implementations2 repos
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes