CL AI LG | May 27, 2021

ProtAugment: Unsupervised diverse short-texts paraphrasing for intent detection meta-learning

Thomas Dopierre, Christophe Gravier, Wilfried Logerais

arXiv:2105.12995v1 | 30.6651 citations | Has Code

Originality: Incremental advance

AI Analysis

The paper addresses data scarcity in intent detection for NLP applications. The advance is incremental because the method builds on existing Prototypical Networks.

The paper tackles few-shot intent detection by proposing ProtAugment, a meta-learning algorithm that uses unsupervised diverse paraphrasing to limit overfitting, achieving state-of-the-art results without extra labeling or domain-specific fine-tuning.

Recent research considers few-shot intent detection as a meta-learning problem: the model learns to learn from a consecutive set of small tasks named episodes. In this work, we propose ProtAugment, a meta-learning algorithm for short-text classification (the intent detection task). ProtAugment is a novel extension of Prototypical Networks that limits overfitting on the bias introduced by the few-shot classification objective at each episode. It relies on diverse paraphrasing: a conditional language model is first fine-tuned for paraphrasing, and diversity is later introduced at the decoding stage at each meta-learning episode. The diverse paraphrasing is unsupervised, as it is applied to unlabelled data, and is then fed into the Prototypical Network training objective as a consistency loss. ProtAugment is the state-of-the-art method for intent detection meta-learning, with no extra labeling effort and without the need to fine-tune a conditional language model on a given application domain.
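To make the idea of adding a consistency loss to the Prototypical Network objective more concrete, here is a minimal sketch of one episode's loss. It is one plausible reading of the abstract, not the authors' implementation: the names `episode_loss`, `proto_loss`, and `unsup_weight`, the squared Euclidean distance, the fixed weighting, and the choice to build one prototype per unlabelled text from its paraphrases are all assumptions made for illustration. Embeddings are plain lists of floats standing in for the output of a text encoder, and the paraphrases are assumed to have been generated already.

```python
import math

def mean_vector(vectors):
    """Element-wise mean of equally sized vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def log_softmax(scores):
    """Numerically stable log-softmax over a list of scores."""
    m = max(scores)
    log_z = m + math.log(sum(math.exp(s - m) for s in scores))
    return [s - log_z for s in scores]

def proto_loss(prototypes, points, targets):
    """Mean cross-entropy of assigning each point to its target prototype.

    Logits are negative squared distances to every prototype.
    """
    total = 0.0
    for point, target in zip(points, targets):
        logits = [-sq_dist(point, proto) for proto in prototypes]
        total -= log_softmax(logits)[target]
    return total / len(points)

def episode_loss(support, query, query_labels, unlabeled, paraphrases,
                 unsup_weight=0.5):
    """Supervised prototypical loss plus an assumed consistency term.

    support:     one list of embeddings per class
    query:       embeddings of labelled query texts
    unlabeled:   embeddings of unlabelled texts
    paraphrases: for each unlabelled text, embeddings of its paraphrases
    unsup_weight is a hypothetical fixed weight, chosen for illustration.
    """
    # Supervised part: class prototypes are the means of the support embeddings.
    class_protos = [mean_vector(s) for s in support]
    supervised = proto_loss(class_protos, query, query_labels)

    # Consistency part (assumption): each unlabelled text should be closest
    # to the prototype built from its own paraphrases.
    para_protos = [mean_vector(p) for p in paraphrases]
    own_index = list(range(len(unlabeled)))
    consistency = proto_loss(para_protos, unlabeled, own_index)

    return supervised + unsup_weight * consistency, supervised, consistency

# Toy 2-D embeddings: two classes, two unlabelled texts with two paraphrases each.
support = [[[0.0, 0.0], [0.2, 0.0]], [[3.0, 3.0], [3.0, 3.2]]]
query = [[0.1, 0.1], [2.9, 3.0]]
query_labels = [0, 1]
unlabeled = [[1.0, 0.0], [0.0, 4.0]]
paraphrases = [[[1.1, 0.0], [0.9, 0.1]], [[0.0, 3.9], [0.1, 4.1]]]

total, supervised, consistency = episode_loss(
    support, query, query_labels, unlabeled, paraphrases
)
print(total, supervised, consistency)
```

In this sketch the consistency term needs no labels: it only asks that an unlabelled text and its paraphrases land near each other in embedding space, which is the sense in which the paraphrasing signal is unsupervised.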
