LG | ML | Apr 1, 2020

Extreme Multi-label Classification from Aggregated Labels

arXiv:2004.00198v1 | 10 citations
Originality: Incremental advance
AI Analysis

This work addresses a practical limitation of XMC in applications where only aggregated labels are available, such as large-scale tagging or recommendation systems. The advance is incremental, since it builds on existing XMC methods.

The paper tackles extreme multi-label classification (XMC) when labels are available only for groups of samples, not for individual ones. It develops a scalable algorithm that imputes individual labels from group labels and can be paired with existing XMC methods. Experiments show advantages over existing approaches.

Extreme multi-label classification (XMC) is the problem of finding the relevant labels for an input, from a very large universe of possible labels. We consider XMC in the setting where labels are available only for groups of samples - but not for individual ones. Current XMC approaches are not built for such multi-instance multi-label (MIML) training data, and MIML approaches do not scale to XMC sizes. We develop a new and scalable algorithm to impute individual-sample labels from the group labels; this can be paired with any existing XMC method to solve the aggregated label problem. We characterize the statistical properties of our algorithm under mild assumptions, and provide a new end-to-end framework for MIML as an extension. Experiments on both aggregated label XMC and MIML tasks show the advantages over existing approaches.
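To make the aggregated-label setting concrete, here is a minimal Python sketch of the data layout together with a naive imputation heuristic. It is not the algorithm from the paper, whose details the abstract does not give. The function names `impute_labels` and `sq_dist`, the toy features, and the nearest-centroid rule are all assumptions chosen for illustration only.

```python
from collections import defaultdict


def sq_dist(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def impute_labels(groups, features):
    """Naive baseline (hypothetical, not the paper's method).

    groups:   list of (sample_ids, label_set); labels are known per group only.
    features: dict mapping sample_id -> list of floats.
    Returns a dict mapping sample_id -> set of imputed labels.
    """
    # Step 1: rough centroid per label, averaged over every sample that
    # appears in any group carrying that label.
    sums, counts = {}, defaultdict(int)
    for sample_ids, labels in groups:
        for label in labels:
            for sid in sample_ids:
                vec = features[sid]
                acc = sums.setdefault(label, [0.0] * len(vec))
                for i, value in enumerate(vec):
                    acc[i] += value
                counts[label] += 1
    centroids = {
        label: [value / counts[label] for value in acc]
        for label, acc in sums.items()
    }

    # Step 2: inside each group, give each group label to the single sample
    # closest to that label's centroid.
    imputed = defaultdict(set)
    for sample_ids, labels in groups:
        for label in labels:
            best = min(
                sample_ids,
                key=lambda sid: sq_dist(features[sid], centroids[label]),
            )
            imputed[best].add(label)
    return dict(imputed)


# Toy data (made up): group 1 mixes both labels, groups 2 and 3 disambiguate.
features = {
    "a": [1.0, 0.0], "b": [0.0, 1.0],
    "c": [1.0, 0.0], "c2": [0.9, 0.1],
    "d": [0.0, 1.0], "d2": [0.1, 0.9],
}
groups = [
    (["a", "b"], {"x", "y"}),
    (["c", "c2"], {"x"}),
    (["d", "d2"], {"y"}),
]
imputed = impute_labels(groups, features)
print(imputed["a"], imputed["b"])
```

The imputed per-sample labels could then be passed to any standard XMC trainer, which is the pairing the abstract describes. A heuristic like this one assigns each group label to exactly one sample and comes with no statistical guarantees, unlike the method the abstract characterizes.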

