Speaker Turn Modeling for Dialogue Act Classification
This work addresses a limitation of existing methods, which treat dialogue as non-interactive text, and offers an incremental improvement for natural language processing tasks that involve conversational data.
The paper tackles Dialogue Act classification by integrating speaker turn changes into utterance modeling, and it reports superior performance on three benchmark datasets.
Dialogue Act (DA) classification is the task of classifying utterances with respect to the function they serve in a dialogue. Existing approaches to DA classification model utterances without incorporating the turn changes among speakers throughout the dialogue, and therefore treat dialogue no differently from non-interactive written text. In this paper, we propose to integrate the turn changes among speakers in a conversation when modeling DAs. Specifically, we learn conversation-invariant speaker turn embeddings to represent the speaker turns in a conversation; the learned speaker turn embeddings are then merged with the utterance embeddings for the downstream task of DA classification. With this simple yet effective mechanism, our model is able to capture the semantics of the dialogue content while accounting for the different speaker turns in a conversation. Experiments on three public benchmark datasets demonstrate the superior performance of our model.
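The merging step described above can be illustrated with a minimal sketch. It is not the paper's implementation and rests on several assumptions that the abstract does not state: turn labels are binary and flip whenever the speaker changes, the embedding table is randomly initialised rather than learned, and "merged" is taken to mean element-wise addition. The names turn_labels, make_turn_table, and merge_embeddings are hypothetical.

```python
import random


def turn_labels(speakers):
    """Map speaker ids to turn labels that do not depend on speaker identity.

    Assumption: the label starts at 0 and flips between 0 and 1 each time
    the speaker differs from the previous utterance's speaker.
    """
    labels = []
    current = 0
    for i, speaker in enumerate(speakers):
        if i > 0 and speaker != speakers[i - 1]:
            current = 1 - current
        labels.append(current)
    return labels


def make_turn_table(dim, seed=0):
    """Create a 2 x dim table of turn embeddings.

    Assumption: random values stand in for embeddings that would be
    learned jointly with the classifier in a real model.
    """
    rng = random.Random(seed)
    return [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(2)]


def merge_embeddings(utterance_vecs, speakers, turn_table):
    """Add the turn embedding of each utterance to its utterance embedding.

    Assumption: element-wise addition is one possible way to "merge".
    """
    labels = turn_labels(speakers)
    return [
        [u + t for u, t in zip(vec, turn_table[label])]
        for vec, label in zip(utterance_vecs, labels)
    ]


# Usage: four utterances from two speakers, with 3-dimensional embeddings.
speakers = ["A", "A", "B", "A"]
utterances = [[0.0, 0.0, 0.0] for _ in speakers]
table = make_turn_table(dim=3)
merged = merge_embeddings(utterances, speakers, table)
```

Because the labels record only whether the speaker has changed, the same two embeddings are reused across conversations, which is one reading of "conversation-invariant". The merged vectors would then be passed to whatever classifier predicts the DA label.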