CL | AI | Sep 22, 2021

DialogueBERT: A Self-Supervised Learning based Dialogue Pre-training Encoder

arXiv:2109.10480v1 | 27 citations
Originality: Incremental advance
AI Analysis

This paper addresses the problem of improving conversational bots for E-commerce customer service by enhancing dialogue understanding. The contribution is incremental, since it builds on existing BERT methods.

The authors tackled the challenge of understanding dialogues, which involve hierarchical structures and alternating roles, by proposing DialogueBERT, a self-supervised pre-trained encoder based on BERT, achieving 88.63% accuracy for intent recognition, 94.25% for emotion recognition, and 97.04% F1 score for named entity recognition.

With the rapid development of artificial intelligence, conversational bots have become prevalent on mainstream E-commerce platforms, where they provide convenient and timely customer service. To satisfy the user, a conversational bot needs to understand the user's intention, detect the user's emotion, and extract the key entities from the conversational utterances. However, understanding dialogues is regarded as a very challenging task. Unlike common language understanding, utterances in dialogues alternate between different roles and are usually organized in hierarchical structures. To facilitate the understanding of dialogues, in this paper we propose a novel contextual dialogue encoder (i.e., DialogueBERT) based on the popular pre-trained language model BERT. Five self-supervised pre-training tasks are devised to learn the particularities of dialogue utterances. Four different input embeddings are integrated to capture the relationships between utterances: turn embedding, role embedding, token embedding, and position embedding. DialogueBERT was pre-trained on 70 million dialogues from real scenarios and then fine-tuned on three different downstream dialogue understanding tasks. Experimental results show that DialogueBERT achieves 88.63% accuracy for intent recognition, 94.25% accuracy for emotion recognition, and a 97.04% F1 score for named entity recognition, outperforming several strong baselines by a large margin.
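
To make the four input embeddings concrete, here is a minimal sketch of how token, position, turn, and role indices might be derived for a short dialogue and combined into one vector per token. The abstract does not say how the embeddings are integrated; element-wise summation is assumed here because that is how BERT combines its own input embeddings. The toy dialogue, the whitespace tokenizer, the table sizes, and the names `build_inputs`, `make_table`, and `embed` are all hypothetical and are not taken from the paper.

```python
import random

# Hypothetical toy dialogue as (role, utterance) pairs; not data from the paper.
dialogue = [
    ("user", "where is my order"),
    ("agent", "let me check"),
    ("user", "thanks"),
]

ROLES = {"user": 0, "agent": 1}
DIM = 8  # toy embedding width, chosen only for illustration


def build_inputs(dialogue):
    """Flatten a dialogue into parallel token/position/turn/role index lists."""
    vocab = {}
    token_ids, position_ids, turn_ids, role_ids = [], [], [], []
    for turn, (role, utterance) in enumerate(dialogue):
        for word in utterance.split():  # whitespace split stands in for a real tokenizer
            token_ids.append(vocab.setdefault(word, len(vocab)))
            position_ids.append(len(position_ids))  # running position over the dialogue
            turn_ids.append(turn)                   # which utterance the token belongs to
            role_ids.append(ROLES[role])            # who is speaking
    return token_ids, position_ids, turn_ids, role_ids, vocab


def make_table(rows, dim, rng):
    """Randomly initialised lookup table standing in for learned embeddings."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(rows)]


def embed(dialogue, seed=0):
    """Return one DIM-sized vector per token, summing the four embeddings."""
    rng = random.Random(seed)
    token_ids, position_ids, turn_ids, role_ids, vocab = build_inputs(dialogue)
    tables = [
        (make_table(len(vocab), DIM, rng), token_ids),
        (make_table(len(position_ids), DIM, rng), position_ids),
        (make_table(len(dialogue), DIM, rng), turn_ids),
        (make_table(len(ROLES), DIM, rng), role_ids),
    ]
    # Assumption: element-wise sum of the four lookups, mirroring BERT's input layer.
    return [
        [sum(table[ids[i]][d] for table, ids in tables) for d in range(DIM)]
        for i in range(len(token_ids))
    ]
```

In this sketch the turn and role indices are what distinguish a dialogue input from a plain BERT input: two tokens with the same text receive different vectors when they come from different turns or speakers.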

