Content Word-based Sentence Decoding and Evaluating for Open-domain Neural Response Generation
This work addresses content relevance in open-domain neural response generation, but the contribution is incremental: it builds on existing encoder-decoder models and adds a new intermediate representation.
The paper tackles the problem of generating more content-relevant responses in open-domain dialog by using content word sequences as an intermediate representation, inspired by Broca's aphasia, and shows improved content relatedness and grammatical correctness in experiments.
Various encoder-decoder models have been applied to response generation in open-domain dialogs, but a majority of conventional models directly learn a mapping from lexical input to lexical output without explicitly modeling intermediate representations. Utilizing language hierarchy and modeling intermediate information have been shown to benefit many language understanding and generation tasks. Motivated by Broca's aphasia, we propose to use a content word sequence as an intermediate representation for open-domain response generation. Experimental results show that the proposed method improves content relatedness of produced responses, and our models can often choose correct grammar for generated content words. Meanwhile, instead of evaluating complete sentences, we propose to compute conventional metrics on content word sequences, which is a better indicator of content relevance.
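The evaluation idea in the last sentence can be illustrated with a minimal sketch: strip function words from both the generated response and the reference, then compute a conventional overlap metric on what remains. Everything named here is an assumption for illustration only. The `FUNCTION_WORDS` list is a tiny hypothetical stand-in (the text does not say how content words are identified), and clipped unigram precision without a brevity penalty stands in for whichever conventional metrics are actually used.

```python
from collections import Counter

# Hypothetical, deliberately tiny function-word list. The actual
# content-word definition is not given in the text above.
FUNCTION_WORDS = {
    "a", "an", "the", "i", "you", "he", "she", "it", "we", "they",
    "is", "am", "are", "was", "were", "be", "to", "of", "in", "on",
    "at", "and", "or", "but", "do", "does", "did", "that", "this",
}


def content_words(sentence):
    """Lowercase, split on whitespace, and drop function words."""
    return [tok for tok in sentence.lower().split() if tok not in FUNCTION_WORDS]


def clipped_unigram_precision(hypothesis, reference):
    """BLEU-1-style clipped unigram precision, with no brevity penalty."""
    if not hypothesis:
        return 0.0
    ref_counts = Counter(reference)
    hyp_counts = Counter(hypothesis)
    # Each hypothesis token is credited at most as often as it occurs
    # in the reference.
    matched = sum(min(count, ref_counts[tok]) for tok, count in hyp_counts.items())
    return matched / len(hypothesis)


def content_precision(hypothesis_sentence, reference_sentence):
    """Apply the same metric to content word sequences only."""
    return clipped_unigram_precision(
        content_words(hypothesis_sentence),
        content_words(reference_sentence),
    )


if __name__ == "__main__":
    hyp = "I like the movie"
    ref = "She did not like that movie"
    # Score on complete sentences versus on content words only.
    print(clipped_unigram_precision(hyp.lower().split(), ref.lower().split()))
    print(content_precision(hyp, ref))
```

In this toy pair the full-sentence score is pulled down by mismatched function words, while the content-word score reflects only the overlap in "like" and "movie". A real setup would likely use a part-of-speech tagger or a curated stopword list in place of the hand-written set.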