LG, AI, IR, ML | May 7, 2019

Taming Pretrained Transformers for Extreme Multi-label Text Classification

arXiv:1905.02331v4 | 68 citations
Originality: Incremental advance
AI Analysis

This paper addresses the challenge of applying deep transformers to XMC tasks with sparse labels, which matters for applications such as product categorization, though the method appears to be an incremental adaptation of existing techniques.

The paper tackles the extreme multi-label text classification (XMC) problem with large label sets by proposing X-Transformer, a scalable fine-tuning approach for pretrained transformers, achieving state-of-the-art results including 77.28% precision@1 on a Wiki dataset with 0.5 million labels and a 10.7% relative improvement on an Amazon dataset.

We consider the extreme multi-label text classification (XMC) problem: given an input text, return the most relevant labels from a large label collection. For example, the input text could be a product description on Amazon.com and the labels could be product categories. XMC is an important yet challenging problem in the NLP community. Recently, deep pretrained transformer models have achieved state-of-the-art performance on many NLP tasks including sentence classification, albeit with small label sets. However, naively applying deep transformer models to the XMC problem leads to sub-optimal performance due to the large output space and the label sparsity issue. In this paper, we propose X-Transformer, the first scalable approach to fine-tuning deep transformer models for the XMC problem. The proposed method achieves new state-of-the-art results on four XMC benchmark datasets. In particular, on a Wiki dataset with around 0.5 million labels, the prec@1 of X-Transformer is 77.28%, a substantial improvement over state-of-the-art XMC approaches Parabel (linear) and AttentionXML (neural), which achieve 68.70% and 76.95% precision@1, respectively. We further apply X-Transformer to a product2query dataset from Amazon and gain a 10.7% relative improvement in prec@1 over Parabel.
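The abstract reports its results as precision@1 and as relative improvement, so a small sketch of how those two numbers are commonly computed may help. This is not the authors' evaluation code: it assumes the usual XMC definition of precision@k (the fraction of the top-k predicted labels that are relevant, averaged over test instances), and the function names and toy data below are hypothetical.

```python
from typing import Sequence, Set


def precision_at_k(ranked_labels: Sequence[int], relevant: Set[int], k: int) -> float:
    """Fraction of the top-k predicted labels that are in the relevant set."""
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for label in ranked_labels[:k] if label in relevant)
    return hits / k


def mean_precision_at_k(
    all_ranked: Sequence[Sequence[int]],
    all_relevant: Sequence[Set[int]],
    k: int,
) -> float:
    """Average precision@k over all test instances."""
    scores = [precision_at_k(r, rel, k) for r, rel in zip(all_ranked, all_relevant)]
    return sum(scores) / len(scores)


def relative_improvement(new: float, baseline: float) -> float:
    """Relative gain of `new` over `baseline`, as a fraction of the baseline."""
    return (new - baseline) / baseline


# Toy data (hypothetical): label ids ranked by predicted score, best first,
# and the ground-truth label set for each of three instances.
ranked = [[3, 1, 7], [2, 5, 9], [4, 0, 8]]
relevant = [{3, 7}, {5}, {4, 8, 6}]

p_at_1 = mean_precision_at_k(ranked, relevant, 1)
p_at_3 = mean_precision_at_k(ranked, relevant, 3)

# Relative gains implied by the Wiki figures quoted in the abstract.
gain_over_parabel = relative_improvement(77.28, 68.70)
gain_over_attentionxml = relative_improvement(77.28, 76.95)
```

Applied to the Wiki figures quoted above, the same arithmetic gives a relative gain of roughly 12.5% over Parabel and roughly 0.4% over AttentionXML. The 10.7% figure in the abstract refers to a different dataset (the Amazon product2query data), so it is not directly comparable to these two numbers.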

Code Implementations: 2 repos