LG CL · Jul 5, 2019

Deep Reinforcement Learning For Modeling Chit-Chat Dialog With Discrete Attributes

arXiv:1907.02848v2 · 50.41013 citations

Originality: Incremental advance

AI Analysis

This work addresses the challenge of generating engaging, varied chit-chat dialog. It appears incremental: it builds on existing RL methods and shifts the optimization target from tokens to dialog attributes.

The paper tackles the problem of repetitive, generic responses in open-domain dialog systems by conditioning response generation on interpretable discrete dialog attributes. The authors report that this improves model perplexity and produces diverse, non-redundant responses.

Open-domain dialog systems face the challenge of being repetitive and producing generic responses. In this paper, we demonstrate that conditioning the response generation on interpretable discrete dialog attributes and composed attributes improves the model perplexity and results in diverse, interesting, non-redundant responses. We propose to formulate dialog attribute prediction as a reinforcement learning (RL) problem and use policy gradient methods to optimize utterance generation using long-term rewards. Unlike existing RL approaches that formulate token prediction as a policy, our method reduces the complexity of policy optimization by limiting the action space to dialog attributes, thereby making the policy optimization more practical and sample efficient. We demonstrate this with experimental and human evaluations.
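
The idea of limiting the action space to dialog attributes can be illustrated with a toy REINFORCE loop. This is a minimal sketch and not the paper's model. The attribute names in `ATTRIBUTES`, the `toy_reward` function, and all hyperparameters are assumptions made for illustration. The policy is a context-free softmax over four attributes, whereas the paper conditions on the dialog and uses long-term rewards.

```python
import math
import random

# Hypothetical attribute inventory (assumption; not taken from the paper).
ATTRIBUTES = ["question", "statement", "agreement", "opinion"]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def toy_reward(attribute):
    """Stand-in for a long-term dialog reward (assumption).

    Here, asking a question is simply declared to be the rewarding action.
    """
    return 1.0 if attribute == "question" else 0.0


def train_attribute_policy(steps=2000, lr=0.1, seed=0):
    """REINFORCE with a moving-average baseline over a small discrete action space."""
    rng = random.Random(seed)
    logits = [0.0] * len(ATTRIBUTES)  # policy parameters, one per attribute
    baseline = 0.0                    # running estimate of the average reward
    for _ in range(steps):
        probs = softmax(logits)
        # Sample an attribute (the action) from the current policy.
        action = rng.choices(range(len(ATTRIBUTES)), weights=probs)[0]
        reward = toy_reward(ATTRIBUTES[action])
        advantage = reward - baseline
        # For a softmax policy, d log pi(a) / d logit_i = 1[i == a] - pi(i).
        for i in range(len(logits)):
            grad_log_prob = (1.0 if i == action else 0.0) - probs[i]
            logits[i] += lr * advantage * grad_log_prob
        baseline += 0.05 * (reward - baseline)
    return softmax(logits)


if __name__ == "__main__":
    for name, p in zip(ATTRIBUTES, train_attribute_policy()):
        print(f"{name}: {p:.3f}")
```

Because the policy chooses among a handful of attributes rather than a full vocabulary of tokens, each update touches only a few parameters, which is the intuition behind the sample-efficiency claim in the abstract.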
