LG · Apr 12, 2017

Deep Extreme Multi-label Learning

arXiv:1704.03718v4 · 132 citations
Originality: Incremental advance
AI Analysis

This paper addresses the challenge of large-scale classification for applications such as Wikipedia tagging, though it is incremental in that it builds on existing XML methods.

The paper tackles extreme multi-label learning with millions of labels by proposing a deep embedding method that combines non-linear embedding and graph priors, achieving competitive results against state-of-the-art methods on public datasets.

Extreme multi-label learning (XML), or classification, has been a practical and important problem since the boom of big data. The main challenge lies in the exponential label space, which involves $2^L$ possible label sets, especially when the label dimension $L$ is huge, e.g., in the millions for Wikipedia labels. This paper is motivated to better explore the label space by originally establishing an explicit label graph. Meanwhile, deep learning has been widely studied and used in various classification problems, including multi-label classification; however, it has not been properly introduced to XML, where the label space can be in the millions. In this paper, we propose a practical deep embedding method for extreme multi-label classification, which combines the ideas of non-linear embedding and graph-prior-based label space modeling. Extensive experiments on public datasets for XML show that our method performs competitively against state-of-the-art results.
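To make the two ideas in the abstract concrete, here is a minimal sketch of the size of the label space ($2^L$ possible label sets) and of one way an explicit label graph could be built. The abstract does not say how the graph is constructed, so the co-occurrence rule used here is an assumption. The helper name `build_label_graph` and the toy data are hypothetical and do not come from the paper.

```python
from collections import Counter
from itertools import combinations


def build_label_graph(label_sets):
    """Hypothetical helper: count how often each pair of labels co-occurs.

    Assumption: an edge links two labels whenever they appear together
    on the same sample. The paper's actual construction may differ.
    """
    edges = Counter()
    for labels in label_sets:
        # Sort so each undirected edge is stored once as (smaller, larger).
        for a, b in combinations(sorted(set(labels)), 2):
            edges[(a, b)] += 1
    return dict(edges)


# Toy data: each entry is the set of label ids attached to one sample.
samples = [{0, 1}, {1, 2}, {0, 1, 2}, {3}]
num_labels = 4

graph = build_label_graph(samples)

# With L labels there are 2^L possible label sets.
num_possible_label_sets = 2 ** num_labels

print(graph)
print(num_possible_label_sets)
```

Even in this toy case the graph is far smaller than the space of label sets: three weighted edges against sixteen possible sets. Exploiting that kind of structure is presumably what a graph prior over labels is for.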

Code Implementations: 1 repo
