CL · LG · Apr 11, 2019

Adapting RNN Sequence Prediction Model to Multi-label Set Prediction

arXiv:1904.05829v1 · 1091 citations
Originality: Highly original
AI Analysis

This work addresses multi-label text classification. It offers an incremental improvement over existing RNN-based methods by modeling the target as a set of labels rather than a sequence.

The authors tackled the problem of multi-label text classification by adapting RNN sequence models to predict sets of labels instead of sequences, deriving a principled set probability formulation and new training and prediction objectives. Experiments on benchmark datasets showed that their method outperformed state-of-the-art approaches.

We present an adaptation of RNN sequence models to the problem of multi-label classification for text, where the target is a set of labels, not a sequence. Previous such RNN models define probabilities for sequences but not for sets; attempts to obtain a set probability are afterthoughts of the network design, such as pre-specifying the label order or relating the sequence probability to the set probability in ad hoc ways. Our formulation is derived from a principled notion of set probability: the sum of the probabilities of all permutation sequences of the set. We provide a new training objective that maximizes this set probability, and a new prediction objective that finds the most probable set for a test document. These new objectives are theoretically appealing because they give the RNN model freedom to discover the best label order, which is often the natural one (but differs among documents). We develop efficient procedures to tackle the computational difficulties involved in training and prediction. Experiments on benchmark datasets demonstrate that we outperform state-of-the-art methods for this task.
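The set probability described in the abstract can be illustrated with a small brute-force sketch. This is not the authors' implementation: `next_label_probs` is a hypothetical toy stand-in for one RNN decoder step, and the label names, weights, and `<stop>` token are invented for illustration. The sketch enumerates every permutation, which is only feasible for tiny label sets; the efficient training and prediction procedures the abstract mentions are not reproduced here.

```python
import math
from itertools import chain, combinations, permutations

# Hypothetical toy vocabulary and weights (assumptions, not from the paper).
LABELS = ("sports", "politics", "finance")
STOP = "<stop>"
BASE_WEIGHT = {"sports": 2.0, "politics": 1.0, "finance": 1.5, STOP: 1.0}
# Order-dependent bonus: "finance" is more likely right after "politics".
PAIR_BONUS = {("politics", "finance"): 3.0}


def next_label_probs(prefix):
    """Toy stand-in for an RNN decoder step: P(next label | prefix).

    Labels already emitted are excluded, so every sequence lists distinct
    labels and ends with STOP.
    """
    candidates = [label for label in LABELS if label not in prefix] + [STOP]
    weights = {}
    for cand in candidates:
        weight = BASE_WEIGHT[cand]
        if prefix:
            weight *= PAIR_BONUS.get((prefix[-1], cand), 1.0)
        weights[cand] = weight
    total = sum(weights.values())
    return {cand: weight / total for cand, weight in weights.items()}


def sequence_prob(seq):
    """Probability of emitting the labels in this exact order, then STOP."""
    prob = 1.0
    prefix = ()
    for label in tuple(seq) + (STOP,):
        prob *= next_label_probs(prefix)[label]
        prefix += (label,)
    return prob


def set_prob(label_set):
    """Set probability: sum of sequence probabilities over all orderings."""
    return sum(sequence_prob(perm) for perm in permutations(sorted(label_set)))


def all_subsets():
    """Every subset of LABELS, including the empty set."""
    return chain.from_iterable(
        combinations(LABELS, k) for k in range(len(LABELS) + 1)
    )


def most_probable_set():
    """Brute-force prediction objective: the subset with the highest set probability."""
    return max(all_subsets(), key=set_prob)


def set_nll(label_set):
    """Training loss for one example: negative log of the set probability."""
    return -math.log(set_prob(label_set))
```

Because the toy model is a proper distribution over stop-terminated sequences of distinct labels, the set probabilities of all subsets sum to 1. Calling `set_prob({"politics", "finance"})` adds the probabilities of both orderings, so the model is not penalized for emitting a correct set in an order that differs from a pre-specified one.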

