Effective Incorporation of Speaker Information in Utterance Encoding in Dialog
This work addresses a specific issue in dialog processing for NLP researchers, offering an incremental improvement over conventional methods.
The paper tackles the problem of inconsistent speaker annotations in dialog encoding by proposing a relative speaker modeling method, which achieves superior and more consistent performance in dialog act recognition and response generation.
In dialog studies, we often encode a dialog using a hierarchical encoder, in which each utterance is converted into an utterance vector and the sequence of utterance vectors is then converted into a dialog vector. Since knowing who produced which utterance is essential to understanding a dialog, conventional methods integrate speaker labels into the utterance vectors. We found this approach problematic when speaker annotations are inconsistent across dialogs. We propose a relative speaker modeling method to address the problem. Experimental evaluations on dialog act recognition and response generation show that the proposed method yields superior and more consistent performance.
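The abstract does not spell out how relative speaker modeling is implemented, so the following is only a minimal sketch of one possible reading, not the authors' method. It assumes that each utterance is relabeled by whether its speaker matches the speaker of a chosen target utterance (here the last one by default). The function name `relative_speaker_labels` and the labels `"self"` and `"other"` are hypothetical and introduced purely for illustration.

```python
from typing import List, Sequence


def relative_speaker_labels(speakers: Sequence[str], target_index: int = -1) -> List[str]:
    """Map absolute speaker labels to labels relative to a target utterance.

    Assumption (not from the paper): an utterance is "self" if it shares a
    speaker with the target utterance, and "other" otherwise.
    """
    if not speakers:
        return []
    target = speakers[target_index]
    return ["self" if s == target else "other" for s in speakers]


# Two dialogs with the same turn-taking pattern, but annotated with
# swapped absolute speaker labels.
dialog_1 = ["A", "B", "A", "B"]
dialog_2 = ["B", "A", "B", "A"]

labels_1 = relative_speaker_labels(dialog_1)
labels_2 = relative_speaker_labels(dialog_2)

print(labels_1)
print(labels_2)
```

Under this assumption, both dialogs map to the same relative label sequence even though their absolute annotations disagree, which is the kind of inconsistency the abstract describes. How such labels are then combined with the utterance vectors is left open here, since the abstract does not specify it.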