LGMLNov 16, 2020

Multi-label classification: do Hamming loss and subset accuracy really conflict with each other?

arXiv:2011.07805v141 citations
AI Analysis

This work addresses a theoretical inconsistency for researchers in multi-label classification, providing insights into measure optimization but is incremental as it builds on existing theories.

The paper tackles the gap between empirical results and theoretical analysis in multi-label classification by analyzing learning guarantees for Hamming Loss (HL) and Subset Accuracy (SA) measures, showing that optimizing HL can yield competitive SA performance in small label spaces with bounds depending on O(c) for SA and independent of c for HL.

Various evaluation measures have been developed for multi-label classification, including Hamming Loss (HL), Subset Accuracy (SA) and Ranking Loss (RL). However, there is a gap between empirical results and the existing theories: 1) an algorithm often empirically performs well on some measure(s) while poorly on others, while a formal theoretical analysis is lacking; and 2) in small label space cases, the algorithms optimizing HL often have comparable or even better performance on the SA measure than those optimizing SA directly, while existing theoretical results show that SA and HL are conflicting measures. This paper provides an attempt to fill up this gap by analyzing the learning guarantees of the corresponding learning algorithms on both SA and HL measures. We show that when a learning algorithm optimizes HL with its surrogate loss, it enjoys an error bound for the HL measure independent of $c$ (the number of labels), while the bound for the SA measure depends on at most $O(c)$. On the other hand, when directly optimizing SA with its surrogate loss, it has learning guarantees that depend on $O(\sqrt{c})$ for both HL and SA measures. This explains the observation that when the label space is not large, optimizing HL with its surrogate loss can have promising performance for SA. We further show that our techniques are applicable to analyze the learning guarantees of algorithms on other measures, such as RL. Finally, the theoretical analyses are supported by experimental results.

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes