Evaluator for Emotionally Consistent Chatbots
This work addresses a gap in chatbot evaluation for developers and researchers, though it appears incremental because it builds on existing evaluation frameworks.
The paper tackles the problem of evaluating whether chatbots are emotionally consistent by proposing a trained evaluator; current methods focus on aspects such as coherence and fluency but do not assess emotional consistency.
One challenge in evaluating current sequence- or dialogue-level chatbots, such as Empathetic Open-domain Conversation Models, is determining whether the chatbot behaves in an emotionally consistent way. Most recent work evaluates only context coherence, language fluency, response diversity, or logical self-consistency between dialogues. This work proposes training an evaluator to determine the emotional consistency of chatbots.