A Unified Approach to Emotion Detection and Task-Oriented Dialogue Modeling
This work addresses the challenge of making dialogue systems more empathetic and context-aware for users. The contribution is incremental, since it builds on existing models such as SimpleToD.
The paper tackles the problem of integrating emotion detection into task-oriented dialogue systems, showing that a unified approach improves both emotion detection accuracy and task performance on the EmoWOZ benchmark.
In current text-based task-oriented dialogue (TOD) systems, user emotion detection (ED) is often overlooked or treated as a separate, independent task that requires additional training. In contrast, our work demonstrates that unifying ED and TOD modeling benefits both tasks, and is therefore a viable alternative. Our method augments SimpleToD, an end-to-end TOD system, by extending belief state tracking to include ED, relying on a single language model. We evaluate our approach using GPT-2 and Llama-2 on the EmoWOZ benchmark, a version of MultiWOZ annotated with emotions. Our results show a general improvement in both ED and task performance. Our findings also indicate that user emotions provide useful contextual conditioning for system responses, and can be leveraged to further refine responses in terms of empathy.
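To make the idea of extending belief state tracking with ED more concrete, the following is a minimal sketch of how one dialogue turn might be flattened into a single training sequence for a language model. It follows the general SimpleToD convention of delimiting segments with special tokens. The `<|emotion|>` and `<|endofemotion|>` tokens, their position after the belief segment, the function names, and the example dialogue are all assumptions made for illustration, not the format used in the paper.

```python
import re


def serialize_example(history, belief, emotion, actions, response):
    """Flatten one dialogue turn into a single training string.

    history:  list of (speaker, utterance) pairs
    belief:   list of (domain, slot, value) triples
    emotion:  user emotion label for the current turn (hypothetical field)
    actions:  list of system action strings
    response: system response text
    """
    context = " ".join(f"<|{speaker}|> {text}" for speaker, text in history)
    belief_str = ", ".join(f"{d} {s} {v}" for d, s, v in belief)
    action_str = ", ".join(actions)
    # Assumption: the emotion segment is placed right after the belief
    # segment, so later segments are generated conditioned on it.
    return (
        f"<|context|> {context} <|endofcontext|> "
        f"<|belief|> {belief_str} <|endofbelief|> "
        f"<|emotion|> {emotion} <|endofemotion|> "
        f"<|action|> {action_str} <|endofaction|> "
        f"<|response|> {response} <|endofresponse|>"
    )


def extract_emotion(generated):
    """Recover the emotion label from a generated sequence, or None."""
    match = re.search(r"<\|emotion\|>\s*(.*?)\s*<\|endofemotion\|>", generated)
    return match.group(1) if match else None


# Illustrative turn (invented for this sketch, not taken from EmoWOZ).
example = serialize_example(
    history=[("user", "I need a cheap hotel, the last one was awful.")],
    belief=[("hotel", "pricerange", "cheap")],
    emotion="dissatisfied",
    actions=["hotel request area"],
    response="Sorry to hear that. Which area would you like to stay in?",
)
predicted_emotion = extract_emotion(example)
```

Because the emotion label sits in the same sequence as the belief state, a single autoregressive model can be trained on it with the usual language modeling objective, and the action and response segments that follow are generated with the emotion already in context.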