SE · May 19, 2021

Dialogue Disentanglement in Software Engineering: How Far are We?

arXiv:2105.08887v1 · 1 citation
Originality: Synthesis-oriented
AI Analysis

This work addresses the problem of separating interleaved software engineering chat conversations for developers and researchers. It is incremental, since it primarily evaluates existing methods and proposes a new evaluation metric.

The authors evaluated five existing dialogue disentanglement approaches on software engineering chat data and found they perform poorly on technical conversations, while also showing that current evaluation metrics don't align with human satisfaction. They introduced a new metric called DLD and identified four common failure patterns in software chat disentanglement.

Software chat messages contain valuable information, but disentangling them into distinct conversations is an essential prerequisite for any in-depth analysis that uses this information. To provide a better understanding of the current state of the art, we evaluate five popular dialog disentanglement approaches on software-related chat. We find that existing approaches do not perform well on disentangling software-related dialogs that discuss technical and complex topics. Further investigation of how well existing disentanglement measures reflect human satisfaction shows that they cannot correctly indicate human satisfaction with disentanglement results. Therefore, in this paper, we introduce and evaluate a novel measure, named DLD. Using the human satisfaction results, we further summarize the four most frequent bad disentanglement cases on software-related chat to inform future improvements. These cases include (i) ignoring interaction patterns; (ii) ignoring contextual information; (iii) mixing up topics; and (iv) ignoring user relationships. We believe that our findings provide valuable insights into the effectiveness of existing dialog disentanglement approaches and that they will promote better application of dialog disentanglement in software engineering.
