Improving Matching Models with Hierarchical Contextualized Representations for Multi-turn Response Selection
This work addresses multi-turn response selection in chatbots, offering a domain-specific solution that builds incrementally on existing matching models.
The paper tackles the problem of multi-turn response selection in retrieval-based chatbots by proposing hierarchical contextualized representations, which blend word-level and sentence-level features from pre-training on large-scale conversations. Experimental results show significant and consistent improvements on two benchmark datasets.
In this paper, we study context-response matching with pre-trained contextualized representations for multi-turn response selection in retrieval-based chatbots. Existing models, such as CoVe and ELMo, are trained with limited context (often a single sentence or paragraph) and may not work well on multi-turn conversations because of their hierarchical structure, informal language, and domain-specific words. To address these challenges, we propose pre-training hierarchical contextualized representations, including contextual word-level and sentence-level representations, by learning a dialogue generation model from large-scale conversations with a hierarchical encoder-decoder architecture. The two levels of representations are then blended into the input and output layers of a matching model, respectively. Experimental results on two benchmark conversation datasets indicate that the proposed hierarchical contextualized representations bring significant and consistent improvements to existing matching models for response selection.
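The data flow described above can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract states only that word-level representations enter at the input layer of the matching model and sentence-level representations at the output layer. Everything else here is an assumption made for illustration, including the toy encoder (a decayed running average standing in for a learned recurrent encoder), the hypothetical helpers `embed`, `running_context`, `blend_input`, and `blend_output`, the use of concatenation for input blending, and the dot-product features for output blending.

```python
import random


def embed(token, dim=4):
    """Deterministic toy embedding (hypothetical stand-in for a learned lookup table)."""
    rng = random.Random(token)
    return [rng.uniform(-1.0, 1.0) for _ in range(dim)]


def mean_pool(vectors):
    """Average a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def running_context(vectors, decay=0.5):
    """Toy stand-in for a recurrent encoder: each output mixes the current
    input with an exponentially decayed summary of the earlier inputs."""
    state = [0.0] * len(vectors[0])
    outputs = []
    for vec in vectors:
        state = [decay * s + (1.0 - decay) * x for s, x in zip(state, vec)]
        outputs.append(list(state))
    return outputs


def hierarchical_encode(context):
    """context: list of utterances, each a non-empty list of tokens.
    Returns (word_level, sentence_level) representations."""
    # Word level: contextualize tokens within each utterance.
    word_level = [running_context([embed(t) for t in utt]) for utt in context]
    # Sentence level: contextualize pooled utterance vectors across turns.
    utterance_vectors = [mean_pool(words) for words in word_level]
    sentence_level = running_context(utterance_vectors)
    return word_level, sentence_level


def blend_input(context, word_level):
    """Assumed input-layer blending: concatenate each static embedding
    with its contextual word-level vector."""
    return [
        [embed(tok) + ctx for tok, ctx in zip(utt, ctx_vecs)]
        for utt, ctx_vecs in zip(context, word_level)
    ]


def blend_output(matching_features, sentence_level, response_vector):
    """Assumed output-layer blending: append one dot-product feature per
    turn to the features produced by the matching model."""
    sims = [sum(a * b for a, b in zip(s, response_vector)) for s in sentence_level]
    return matching_features + sims


context = [["hi", "there"], ["how", "are", "you"]]
word_level, sentence_level = hierarchical_encode(context)
matcher_inputs = blend_input(context, word_level)
final_features = blend_output([0.1, 0.2], sentence_level, embed("fine"))
```

In this sketch the matching model itself is left abstract: `matcher_inputs` is what it would consume, and `[0.1, 0.2]` is a placeholder for the features it would produce before the final scoring layer.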