CL, AI | Oct 27, 2022

Dial2vec: Self-Guided Contrastive Learning of Unsupervised Dialogue Embeddings

Che Liu, Rui Wang, Junfeng Jiang, Yongbin Li, Fei Huang

arXiv:2210.15332v124.0294 citations | h-index: 48 | Has Code

Originality: Highly original

AI Analysis

This work addresses the poor performance of existing dialogue embeddings, a concern for researchers and practitioners in conversational AI. It introduces a novel method that leverages interaction patterns between interlocutors, although its gains build incrementally on existing approaches.

The paper tackles the problem of learning unsupervised dialogue embeddings by proposing dial2vec, a self-guided contrastive learning method that captures conversational interactions between interlocutors. It achieves average absolute improvements of 8.7, 9.0, and 13.8 points in purity, Spearman's correlation, and mean average precision (MAP) over the strongest baseline on domain categorization, semantic relatedness, and dialogue retrieval, respectively.

In this paper, we introduce the task of learning unsupervised dialogue embeddings. Trivial approaches such as combining pre-trained word or sentence embeddings and encoding through pre-trained language models (PLMs) have been shown to be feasible for this task. However, these approaches typically ignore the conversational interactions between interlocutors, resulting in poor performance. To address this issue, we propose a self-guided contrastive learning approach named dial2vec. Dial2vec considers a dialogue as an information exchange process. It captures the conversational interaction patterns between interlocutors and leverages them to guide the learning of the embeddings corresponding to each interlocutor. The dialogue embedding is obtained by aggregating the embeddings from all interlocutors. To verify our approach, we establish a comprehensive benchmark consisting of six widely used dialogue datasets. We consider three evaluation tasks: domain categorization, semantic relatedness, and dialogue retrieval. Dial2vec achieves on average 8.7, 9.0, and 13.8 points absolute improvement in terms of purity, Spearman's correlation, and mean average precision (MAP) over the strongest baseline on the three tasks, respectively. Further analysis shows that dial2vec obtains informative and discriminative embeddings for both interlocutors under the guidance of the conversational interactions and achieves the best performance when aggregating them through the interlocutor-level pooling strategy. All code and data are publicly available at https://github.com/AlibabaResearch/DAMO-ConvAI/tree/main/dial2vec.
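The abstract states that the dialogue embedding is an aggregation of per-interlocutor embeddings, and that interlocutor-level pooling works best. The following is a minimal sketch of one way such pooling could look, not the authors' implementation. It assumes that each turn already has a fixed-size embedding, that turns are mean-pooled per speaker, and that the per-speaker vectors are then averaged. The function names `mean_vector` and `interlocutor_level_pooling` and the toy inputs are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def mean_vector(vectors: Sequence[Sequence[float]]) -> List[float]:
    """Element-wise mean of equally sized vectors."""
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def interlocutor_level_pooling(
    turn_embeddings: Sequence[Sequence[float]],
    speakers: Sequence[str],
) -> List[float]:
    """Pool turns per interlocutor, then average the interlocutor vectors.

    Assumption: mean pooling at both levels. The abstract only says the
    dialogue embedding is an aggregation over all interlocutors.
    """
    if len(turn_embeddings) != len(speakers):
        raise ValueError("need exactly one speaker label per turn")

    # Group turn embeddings by the interlocutor who produced them.
    by_speaker: Dict[str, List[Sequence[float]]] = defaultdict(list)
    for vec, spk in zip(turn_embeddings, speakers):
        by_speaker[spk].append(vec)

    # One embedding per interlocutor, then one embedding per dialogue.
    per_interlocutor = [mean_vector(vs) for vs in by_speaker.values()]
    return mean_vector(per_interlocutor)


# Toy dialogue: three turns with 2-dimensional embeddings (made-up values).
turns = [[1.0, 0.0], [0.0, 1.0], [3.0, 0.0]]
speakers = ["user", "agent", "user"]
dialogue_embedding = interlocutor_level_pooling(turns, speakers)
print(dialogue_embedding)
```

Pooling per interlocutor first gives each speaker equal weight in the final vector regardless of how many turns they contribute, which is the main difference from averaging all turns directly.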
