CL · Oct 31, 2019

Learning to Customize Model Structures for Few-shot Dialogue Generation Tasks

arXiv:1910.14326v21010 citations
Originality: Incremental advance
AI Analysis

This work addresses the problem of building adaptable dialogue systems for developers in few-shot settings, offering an incremental improvement over existing meta-learning methods by focusing on model-structure customization.

The paper tackles the challenge of training generative models for open-domain dialogue systems with minimal data by proposing an algorithm that customizes unique model structures for each few-shot task, outperforming baselines in task consistency, response quality, and diversity.

Training generative models with a minimal corpus is one of the critical challenges in building open-domain dialogue systems. Existing methods tend to use the meta-learning framework, which pre-trains the parameters on all non-target tasks and then fine-tunes them on the target task. However, fine-tuning distinguishes tasks from the parameter perspective but ignores the model-structure perspective, resulting in similar dialogue models for different tasks. In this paper, we propose an algorithm that can customize a unique dialogue model for each task in the few-shot setting. In our approach, each dialogue model consists of a shared module, a gating module, and a private module. The first two modules are shared among all the tasks, while the third differentiates into different network structures to better capture the characteristics of the corresponding task. Extensive experiments on two datasets show that our method outperforms all the baselines in terms of task consistency, response quality, and diversity.
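The three-module layout described in the abstract can be illustrated with a toy sketch. This is not the paper's implementation: the abstract does not say how the modules are combined or how the private module differentiates, so the sketch assumes a sigmoid gate that mixes the two outputs and a task-specific 0/1 mask that stands in for a differentiated structure. All names (`CustomizedModel`, `private_mask`, `make_task_model`) are hypothetical, and simple linear maps replace real dialogue generators.

```python
import math
import random


def linear(weights, x):
    """Multiply a weight matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def random_matrix(rows, cols, rng):
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


class CustomizedModel:
    """Toy model with a shared, a gating, and a private module.

    ASSUMPTION: the gate mixes the two outputs element-wise, and the
    private structure is expressed as a 0/1 mask over its weights.
    """

    def __init__(self, shared_w, gate_w, private_w, private_mask):
        self.shared_w = shared_w          # same object for every task
        self.gate_w = gate_w              # same object for every task
        self.private_w = private_w        # task-specific weights
        self.private_mask = private_mask  # task-specific structure

    def forward(self, x):
        shared_out = linear(self.shared_w, x)
        # Masked-out connections do not exist in this task's structure.
        masked_w = [
            [w * m for w, m in zip(w_row, m_row)]
            for w_row, m_row in zip(self.private_w, self.private_mask)
        ]
        private_out = linear(masked_w, x)
        gates = [sigmoid(g) for g in linear(self.gate_w, x)]
        return [
            g * s + (1.0 - g) * p
            for g, s, p in zip(gates, shared_out, private_out)
        ]


rng = random.Random(0)
dim = 3
shared_w = random_matrix(dim, dim, rng)
gate_w = random_matrix(dim, dim, rng)


def make_task_model(mask):
    """Build a per-task model that reuses the shared and gating weights."""
    return CustomizedModel(shared_w, gate_w, random_matrix(dim, dim, rng), mask)


# Two hypothetical tasks whose private modules have different structures.
model_a = make_task_model([[1, 0, 0], [0, 1, 0], [0, 0, 1]])
model_b = make_task_model([[1, 1, 0], [0, 1, 1], [1, 0, 1]])

x = [0.5, -1.0, 2.0]
out_a = model_a.forward(x)
out_b = model_b.forward(x)
```

Here `model_a` and `model_b` reference the same shared and gating weights but differ in which private connections exist, which is the distinction the abstract draws between the parameter perspective and the model-structure perspective.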

Code Implementations: 1 repo
