AI · Dec 10, 2020

GNN-XML: Graph Neural Networks for Extreme Multi-label Text Classification

arXiv:2012.05860v1 · 12 citations
AI Analysis

This paper tackles the data scalability and sparsity challenges in extreme multi-label text classification for applications like news annotation and product recommendation, offering an incremental improvement over existing methods.

This paper addresses extreme multi-label text classification (XMTC) where texts are tagged from extremely large label sets. The proposed GNN-XML framework significantly outperforms state-of-the-art methods on multiple benchmark datasets while maintaining comparable prediction efficiency and model size.

Extreme multi-label text classification (XMTC) aims to tag a text instance with the most relevant subset of labels from an extremely large label set. XMTC has attracted much recent attention due to the massive label sets produced by modern applications, such as news annotation and product recommendation. The main challenges of XMTC are data scalability and sparsity, which lead to two issues: (i) the intractability of scaling to the extreme label setting, and (ii) a long-tailed label distribution, in which a large fraction of labels have few positive training instances. To overcome these problems, we propose GNN-XML, a scalable graph neural network framework tailored for XMTC problems. Specifically, we exploit label correlations by mining their co-occurrence patterns and build a label graph based on the correlation matrix. We then perform attributed graph clustering by applying graph convolution with a low-pass graph filter to jointly model label dependencies and label features, which induces semantic label clusters. We further propose a bilateral-branch graph isomorphism network to decouple representation learning and classifier learning for better modeling of tail labels. Experimental results on multiple benchmark datasets show that GNN-XML significantly outperforms state-of-the-art methods while maintaining comparable prediction efficiency and model size.
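
The label-graph and low-pass filtering steps described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the abstract does not specify how the correlation matrix is normalized or which filter is used, so the conditional-probability weighting, the symmetrization by maximum, the filter `I - L/2`, and the names `label_cooccurrence_graph`, `low_pass_filter`, `threshold`, and `k` are all assumptions made for illustration.

```python
import numpy as np


def label_cooccurrence_graph(Y, threshold=0.0):
    """Build a symmetric label graph from a binary label matrix.

    Y has shape (n_instances, n_labels). The weighting scheme here
    (conditional co-occurrence probability) is an assumption.
    """
    Y = np.asarray(Y, dtype=float)
    co = Y.T @ Y                    # co[i, j]: instances tagged with both i and j
    counts = np.diag(co).copy()     # counts[i]: instances tagged with label i
    np.fill_diagonal(co, 0.0)       # drop self-loops
    # cond[i, j] approximates P(label j | label i)
    cond = co / np.maximum(counts, 1.0)[:, None]
    A = np.maximum(cond, cond.T)    # symmetrize (assumed choice)
    A[A <= threshold] = 0.0         # prune weak edges
    return A


def low_pass_filter(A, X, k=2):
    """Smooth label features X over the graph A with (I - L/2)^k.

    L is the symmetrically normalized Laplacian. I - L/2 is one common
    low-pass filter, assumed here; isolated labels are left unchanged.
    """
    n = A.shape[0]
    deg = A.sum(axis=1)
    connected = deg > 0
    inv_sqrt = np.zeros(n)
    inv_sqrt[connected] = deg[connected] ** -0.5
    A_norm = inv_sqrt[:, None] * A * inv_sqrt[None, :]
    L = np.diag(connected.astype(float)) - A_norm
    G = np.eye(n) - 0.5 * L
    out = np.asarray(X, dtype=float)
    for _ in range(k):
        out = G @ out
    return out


# Toy data: 4 instances, 3 labels; labels 0 and 1 co-occur, label 2 is isolated.
Y = np.array([[1, 1, 0],
              [1, 1, 0],
              [0, 0, 1],
              [1, 0, 0]])
A = label_cooccurrence_graph(Y)
X_smooth = low_pass_filter(A, np.eye(3), k=1)
```

The smoothed features `X_smooth` could then be passed to any clustering routine (for example k-means) to obtain label clusters. Computing the dense `Y.T @ Y` product is only feasible for small label sets; at extreme scale a sparse representation would be needed.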
