CL · Apr 21, 2020

DIET: Lightweight Language Understanding for Dialogue Systems

arXiv:2004.09936v3 · 184 citations
AI Analysis

This work addresses the need for efficient dialogue systems with a lightweight model that reduces training time and computational cost. It is incremental, however, in that it builds on existing transformer and pre-training methods.

The paper introduces the Dual Intent and Entity Transformer (DIET), a lightweight architecture for language understanding in dialogue systems. DIET advances the state of the art on a complex multi-domain NLU dataset and achieves similarly high performance on simpler datasets. The best model outperforms fine-tuned BERT and is about six times faster to train.

Large-scale pre-trained language models have shown impressive results on language understanding benchmarks like GLUE and SuperGLUE, improving considerably over other pre-training methods like distributed representations (GloVe) and purely supervised approaches. We introduce the Dual Intent and Entity Transformer (DIET) architecture, and study the effectiveness of different pre-trained representations on intent and entity prediction, two common dialogue language understanding tasks. DIET advances the state of the art on a complex multi-domain NLU dataset and achieves similarly high performance on other simpler datasets. Surprisingly, we show that there is no clear benefit to using large pre-trained models for this task, and in fact DIET improves upon the current state of the art even in a purely supervised setup without any pre-trained embeddings. Our best performing model outperforms fine-tuning BERT and is about six times faster to train.

Code Implementations: 2 repos