Hierarchical Text Generation and Planning for Strategic Dialogue
This work addresses the entanglement of linguistic and strategic aspects in dialogue systems, a problem relevant to both researchers and practitioners. The contribution is incremental, since it builds on existing latent-variable models.
The paper tackles the difficulty of training end-to-end models for goal-oriented dialogue. It decouples dialogue semantics from linguistic realization through latent sentence representations, which increases end-task reward by 15%, makes rollout-based planning more effective, and lets self-play reinforcement learning improve decision making without diverging from human language.
End-to-end models for goal-oriented dialogue are challenging to train, because linguistic and strategic aspects are entangled in latent state vectors. We introduce an approach to learning representations of messages in dialogues by maximizing the likelihood of subsequent sentences and actions, which decouples the semantics of the dialogue utterance from its linguistic realization. We then use these latent sentence representations for hierarchical language generation, planning, and reinforcement learning. Experiments show that our approach increases the end-task reward achieved by the model, improves the effectiveness of long-term planning using rollouts, and allows self-play reinforcement learning to improve decision making without diverging from human language. Our hierarchical latent-variable model outperforms previous work both linguistically and strategically.
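As a rough illustration of planning with rollouts over latent message representations, the sketch below scores each candidate latent message by its average simulated reward and picks the best one. It is a toy example under stated assumptions, not the model described above: the names `plan_with_rollouts` and `toy_simulate`, the integer message ids, and the reward table are all hypothetical, and the simulator stands in for rolling a dialogue out to its end with a learned model.

```python
import random
from statistics import mean
from typing import Callable, Dict, Sequence, Tuple


def plan_with_rollouts(
    candidates: Sequence[int],
    simulate: Callable[[int, random.Random], float],
    num_rollouts: int = 20,
    seed: int = 0,
) -> Tuple[int, Dict[int, float]]:
    """Return the candidate with the highest mean simulated reward.

    `candidates` are hypothetical ids of discrete latent messages, and
    `simulate` is an assumed callback that plays out the rest of the
    dialogue after sending a message and returns the final reward.
    """
    rng = random.Random(seed)
    # Estimate the expected reward of each candidate by averaging rollouts.
    expected = {
        z: mean(simulate(z, rng) for _ in range(num_rollouts))
        for z in candidates
    }
    # Choose the candidate whose estimated reward is largest.
    best = max(expected, key=lambda z: expected[z])
    return best, expected


def toy_simulate(z: int, rng: random.Random) -> float:
    """Hypothetical simulator: a fixed base reward plus bounded noise."""
    base_reward = {0: 2.0, 1: 5.0, 2: 3.5}[z]
    return base_reward + rng.uniform(-0.5, 0.5)


best, expected = plan_with_rollouts([0, 1, 2], toy_simulate)
print(best, expected)
```

In this toy setting the planner selects candidate 1, because its noisy rewards always exceed those of the other two candidates. Planning over a small set of latent messages rather than over word sequences is the design choice the sketch is meant to convey: the choice of what to say is made first, and wording is left to a separate generation step.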