SideControl: Controlled Open-domain Dialogue Generation via Additive Side Networks
This work addresses the challenge of efficiently controlling dialogue generation for applications such as chatbots, though it appears incremental because it builds on prior Transformer-based methods.
The paper tackles the problem of controlling attribute-specific text generation in open-domain dialogue systems by proposing the SideControl framework, which achieves better controllability, higher generation quality, and improved sample-efficiency compared to existing gradient-based and weighted-decoding methods on two benchmark datasets.
Transformer-based pre-trained language models boost the performance of open-domain dialogue systems. Prior works use these models to generate text with desired attributes through two general approaches: (1) gradient-based methods, which update all latent representations of the pre-trained model with gradients from attribute models; and (2) weighted-decoding methods, which re-rank beam candidates from the pre-trained model with attribute functions. However, gradient-based methods incur high computation cost and easily overfit on small training sets, while weighted-decoding methods are inherently constrained by the low-variance, high-bias pre-trained model. In this work, we propose a novel approach to controlling the generation of Transformer-based pre-trained language models: the SideControl framework, which leverages a novel control attributes loss to incorporate useful control signals and is shown to perform well with very limited training samples. We evaluate the proposed method on two benchmark open-domain dialogue datasets, and the results show that the SideControl framework has better controllability, higher generation quality, and better sample-efficiency than existing gradient-based and weighted-decoding baselines.
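To make the weighted-decoding baseline concrete, the following is a minimal sketch of re-ranking beam candidates with an attribute function. It is not the paper's implementation: the function names, the keyword-based politeness scorer, the candidate texts, and their log-probabilities are all hypothetical and chosen only for illustration.

```python
from typing import Callable, List, Tuple

# A candidate is (text, log-probability under the pre-trained model).
Candidate = Tuple[str, float]


def rerank_candidates(
    candidates: List[Candidate],
    attribute_fn: Callable[[str], float],
    weight: float = 1.0,
) -> List[Candidate]:
    """Re-rank candidates by LM log-probability plus a weighted attribute score.

    Hypothetical helper: the combined score is lm_log_prob + weight * attribute_fn(text).
    """
    scored = [
        (text, lm_log_prob + weight * attribute_fn(text))
        for text, lm_log_prob in candidates
    ]
    # Highest combined score first.
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def toy_politeness_score(text: str) -> float:
    """Toy attribute function (an assumption): count polite keywords in the text."""
    polite_words = {"please", "thanks", "thank"}
    tokens = [token.strip(".,!?") for token in text.lower().split()]
    return float(sum(token in polite_words for token in tokens))


# Made-up beam candidates with made-up log-probabilities.
candidates: List[Candidate] = [
    ("Give me that.", -1.0),
    ("Could you pass that, please?", -1.5),
    ("Thanks, please pass that.", -2.0),
]

if __name__ == "__main__":
    for text, score in rerank_candidates(candidates, toy_politeness_score, weight=1.0):
        print(f"{score:+.2f}  {text}")
```

With `weight=0.0` the ordering falls back to the pre-trained model's own ranking. In either case the re-ranker can only choose among candidates the base model already proposed, which is the constraint the abstract attributes to weighted-decoding methods.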