Grounding in social media: An approach to building a chit-chat dialogue model
This work addresses the challenge of building more human-like chit-chat dialogue models for conversational AI applications. The contribution appears incremental, since it extends existing knowledge-grounded methods to social media data.
The paper tackles the problem of open-domain dialogue systems producing repetitive or generic responses by proposing a method that uses social media comments as external knowledge to improve conversational ability. The authors report that automatic and human evaluations on open-domain dialogue datasets show the approach is effective.
Building open-domain dialogue systems capable of rich, human-like conversation is one of the fundamental challenges in language generation. Even with recent advances in the field, existing open-domain generative models fail to capture and use external knowledge, which leads to repetitive or generic responses to unseen utterances. Current work on knowledge-grounded dialogue generation focuses primarily on incorporating personas or on searching a fact-based structured knowledge source such as Wikipedia. Our method takes a broader and simpler approach: it aims to improve the raw conversational ability of the system by mimicking human response behavior in casual interactions found on social media. Using a joint retriever-generator setup, the model queries a large set of filtered comment data from Reddit, and the retrieved comments act as additional context for the seq2seq generator. Automatic and human evaluations on open-domain dialogue datasets demonstrate the effectiveness of our approach.
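The retrieve-then-generate flow described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the retriever, so a simple bag-of-words cosine similarity stands in for it, and the generator itself is omitted. The function names (`retrieve`, `build_generator_input`), the `[SEP]` separator, and the toy comment pool are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, comments, k=2):
    """Return up to k comments most similar to the query.

    Stand-in retriever (assumption): term-overlap cosine similarity,
    not the retriever used in the paper.
    """
    q = Counter(tokenize(query))
    scored = [
        (cosine(q, Counter(tokenize(c))), i, c) for i, c in enumerate(comments)
    ]
    # Highest score first; ties broken by original position in the pool.
    scored.sort(key=lambda s: (-s[0], s[1]))
    return [c for score, _, c in scored[:k] if score > 0]


def build_generator_input(query, retrieved, sep=" [SEP] "):
    """Concatenate the utterance with retrieved comments as extra context.

    The separator token is hypothetical; a real seq2seq generator would
    use whatever special tokens its tokenizer defines.
    """
    return sep.join([query] + retrieved)


# Toy comment pool standing in for filtered Reddit comments (made-up data).
comments = [
    "I love hiking in the mountains every summer.",
    "Pizza with pineapple is underrated.",
    "My dog goes hiking with me on weekends.",
]
query = "Do you like hiking?"
retrieved = retrieve(query, comments, k=2)
model_input = build_generator_input(query, retrieved)
print(model_input)
```

In this sketch the string `model_input` is what would be passed to the seq2seq generator. In a joint setup such as the one the abstract describes, the retriever would presumably be a learned component rather than a fixed similarity function.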