DialBERT: A Hierarchical Pre-Trained Model for Conversation Disentanglement
This addresses the challenge for systems needing to separate mixed conversations, but it is incremental as it builds on existing BERT and BiLSTM methods.
The paper tackles the problem of conversation disentanglement, where multiple conversations occur in the same channel, by proposing DialBERT, a hierarchical pre-trained model that integrates local and global semantics; it achieves a 12% improvement in F1-Score over BERT with only a 3% parameter increase and sets a state-of-the-art result on a new IBM dataset.
Disentanglement is a problem in which multiple conversations occur in the same channel simultaneously, and the listener should decide which utterance is part of the conversation he will respond to. We propose a new model, named Dialogue BERT (DialBERT), which integrates local and global semantics in a single stream of messages to disentangle the conversations that mixed together. We employ BERT to capture the matching information in each utterance pair at the utterance-level, and use a BiLSTM to aggregate and incorporate the context-level information. With only a 3% increase in parameters, a 12% improvement has been attained in comparison to BERT, based on the F1-Score. The model achieves a state-of-the-art result on the a new dataset proposed by IBM and surpasses previous work by a substantial margin.