CL, Jun 12, 2021

Improving Unsupervised Dialogue Topic Segmentation with Utterance-Pair Coherence Scoring

arXiv:2106.06719v1702 citations
Originality Incremental advance
AI Analysis

This work addresses a limitation in dialogue modeling for applications like conversational AI, though it is incremental as it builds on existing unsupervised approaches.

The paper tackles unsupervised dialogue topic segmentation by using utterance-pair coherence scoring to measure topical relevance between utterances. The method outperforms state-of-the-art baselines on three public datasets in English and Chinese.

Dialogue topic segmentation is critical in several dialogue modeling problems. However, popular unsupervised approaches only exploit surface features in assessing topical coherence among utterances. In this work, we address this limitation by leveraging supervisory signals from the utterance-pair coherence scoring task. First, we present a simple yet effective strategy to generate a training corpus for utterance-pair coherence scoring. Then, we train a BERT-based neural utterance-pair coherence model with the obtained training corpus. Finally, this model is used to measure the topical relevance between utterances, which serves as the basis of the segmentation inference. Experiments on three public datasets in English and Chinese demonstrate that our proposal outperforms the state-of-the-art baselines.
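The final step of the abstract, turning pairwise coherence scores into segment boundaries, can be illustrated with a small sketch. The abstract does not spell out the inference procedure, so the code below assumes a TextTiling-style depth-score rule: a boundary is placed wherever the coherence between adjacent utterances dips well below its neighbouring peaks. The names `depth_scores`, `segment`, `toy_score`, and the parameter `k` are hypothetical, and `toy_score` is only a word-overlap stand-in for the trained BERT-based coherence model.

```python
from statistics import mean, pstdev
from typing import Callable, List, Sequence


def toy_score(a: str, b: str) -> float:
    """Placeholder scorer (assumption): Jaccard word overlap, NOT the BERT model."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa | wb) if wa | wb else 0.0


def depth_scores(coherence: Sequence[float]) -> List[float]:
    """TextTiling-style depth of each gap between adjacent utterances."""
    depths = []
    for i, score in enumerate(coherence):
        # Climb to the nearest peak on the left while scores do not decrease.
        left = score
        for j in range(i - 1, -1, -1):
            if coherence[j] < left:
                break
            left = coherence[j]
        # Climb to the nearest peak on the right in the same way.
        right = score
        for j in range(i + 1, len(coherence)):
            if coherence[j] < right:
                break
            right = coherence[j]
        depths.append((left - score) + (right - score))
    return depths


def segment(
    utterances: Sequence[str],
    score_pair: Callable[[str, str], float],
    k: float = 1.0,
) -> List[List[str]]:
    """Split a dialogue where the depth score exceeds mean + k * std (assumed rule)."""
    if not utterances:
        return []
    if len(utterances) < 2:
        return [list(utterances)]
    # One coherence score per pair of adjacent utterances.
    coherence = [score_pair(a, b) for a, b in zip(utterances, utterances[1:])]
    depths = depth_scores(coherence)
    threshold = mean(depths) + k * pstdev(depths)
    segments, current = [], [utterances[0]]
    for utterance, depth in zip(utterances[1:], depths):
        if depth > threshold:  # deep valley: start a new topic segment
            segments.append(current)
            current = []
        current.append(utterance)
    segments.append(current)
    return segments
```

Calling `segment(dialogue, toy_score)` returns a list of utterance lists, one per inferred topic. Swapping `toy_score` for a learned pairwise scorer changes only the `score_pair` argument, and `k` controls how pronounced a dip must be before it counts as a boundary.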

Code Implementations: 1 repo
