CL · LG · Aug 14, 2020

Language Models as Few-Shot Learner for Task-Oriented Dialogue Systems

arXiv:2008.06239v2 · 63 citations
Originality: Synthesis-oriented
AI Analysis

This paper addresses the challenge of reducing data collection costs for dialogue systems, but its contribution is incremental: it evaluates existing methods rather than introducing new ones.

The paper tackles few-shot learning for task-oriented dialogue systems by evaluating the priming ability of language models such as GPT-2 and GPT-3 across the NLU, DST, DP, and NLG tasks. It highlights current limitations and discusses implications for future work, without providing concrete numerical results.

Task-oriented dialogue systems use four connected modules: Natural Language Understanding (NLU), Dialogue State Tracking (DST), Dialogue Policy (DP), and Natural Language Generation (NLG). A research challenge is to learn each module with the fewest samples (i.e., few-shot), given the high cost of data collection. The most common and effective technique for this problem is transfer learning, where large language models, pre-trained either on text or on task-specific data, are fine-tuned on the few samples. These methods require fine-tuning steps and a set of parameters for each task. In contrast, language models such as GPT-2 (Radford et al., 2019) and GPT-3 (Brown et al., 2020) allow few-shot learning by priming the model with a few examples. In this paper, we evaluate the few-shot priming ability of language models on the NLU, DST, DP, and NLG tasks. Importantly, we highlight the current limitations of this approach and discuss the possible implications for future work.
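
To make the contrast with fine-tuning concrete, here is a minimal sketch of how a priming prompt could be assembled for an NLU-style intent task. The `utterance => label` format, the function name `build_priming_prompt`, and the example utterances and intent labels are all illustrative assumptions; the abstract does not specify the prompt format used in the paper. No model is called here. The resulting string would be handed to a language model such as GPT-2, and its continuation read as the prediction, with no parameter updates.

```python
from typing import List, Tuple

# Hypothetical few-shot examples (utterance, intent label); not taken from the paper.
EXAMPLES: List[Tuple[str, str]] = [
    ("book a table for two", "restaurant_booking"),
    ("what's the weather tomorrow", "weather_query"),
]


def build_priming_prompt(examples: List[Tuple[str, str]], query: str) -> str:
    """Concatenate labeled examples and one unlabeled query into a single prompt.

    The "=>" separator is an assumed format chosen for illustration.
    """
    lines = [f"{utterance} => {label}" for utterance, label in examples]
    # The last line is left open so the language model completes it with a label.
    lines.append(f"{query} =>")
    return "\n".join(lines)


prompt = build_priming_prompt(EXAMPLES, "find me a cheap hotel")
print(prompt)
```

Under this assumed format, the few labeled samples live only in the prompt, so the same frozen model can serve every task by swapping the examples, whereas fine-tuning needs a training run and a set of parameters per task.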

