Interactive Evaluation of Dialog Track at DSTC9
This work addresses the need of researchers and developers for more thorough evaluation of open-domain dialog systems, but it is incremental: it builds on existing dialog challenges by adding interactive components.
The paper introduces the Interactive Evaluation of Dialog Track at DSTC9, which tackles the problem of moving dialog systems from static datasets to interactive settings with real users. The track challenged participants to build knowledge-grounded response generation models and to assess them in back-and-forth interactions.
The ultimate goal of dialog research is to develop systems that real users can use effectively in interactive settings. To this end, we introduced the Interactive Evaluation of Dialog Track at the 9th Dialog System Technology Challenge (DSTC9). The track consisted of two sub-tasks. The first sub-task involved building knowledge-grounded response generation models. The second sub-task aimed to extend dialog models beyond static datasets by assessing them in an interactive setting with real users. The track challenged participants to develop strong response generation models and to explore strategies for extending them to back-and-forth interactions with real users. The progression from static corpora to interactive evaluation introduces unique challenges and facilitates a more thorough assessment of open-domain dialog systems. This paper provides an overview of the track, including its methodology and results. It also offers insights into how best to evaluate open-domain dialog models.