ConvLab-2: An Open-Source Toolkit for Building, Evaluating, and Diagnosing Dialogue Systems
ConvLab-2 gives dialogue-system researchers a single platform for developing and evaluating task-oriented dialogue systems. The contribution is incremental in that it builds on the earlier ConvLab framework.
To address the difficulty of building and evaluating task-oriented dialogue systems, the authors introduce ConvLab-2, an open-source toolkit that integrates state-of-the-art models, supports multiple datasets, and includes tools for end-to-end evaluation and diagnosis. These tools help researchers analyze and correct system weaknesses.
We present ConvLab-2, an open-source toolkit that enables researchers to build task-oriented dialogue systems with state-of-the-art models, perform end-to-end evaluation, and diagnose system weaknesses. As the successor to ConvLab (Lee et al., 2019b), ConvLab-2 inherits ConvLab's framework but integrates more powerful dialogue models and supports more datasets. In addition, we have developed an analysis tool and an interactive tool to assist researchers in diagnosing dialogue systems. The analysis tool presents rich statistics and summarizes common mistakes from simulated dialogues, which facilitates error analysis and system improvement. The interactive tool provides a user interface that allows developers to diagnose an assembled dialogue system by interacting with it and modifying the output of each system component.
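The two diagnostic ideas described above, overriding the output of one component in an assembled system and summarizing outcomes over many dialogues, can be illustrated with a minimal sketch. Everything in it is a hypothetical stand-in written for illustration: `toy_nlu`, `toy_policy`, `toy_nlg`, `run_turn`, and `summarize_intents` are assumed names and are not part of the ConvLab-2 API, and the three-stage pipeline is a simplification of a real task-oriented system.

```python
from collections import Counter
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical toy components; these are NOT ConvLab-2 classes or functions.
def toy_nlu(utterance: str) -> str:
    """Map a user utterance to a coarse intent label (keyword match only)."""
    return "request_hotel" if "hotel" in utterance.lower() else "unknown"

def toy_policy(intent: str) -> str:
    """Choose a system action from the recognized intent."""
    return "offer_hotel" if intent == "request_hotel" else "ask_repeat"

def toy_nlg(action: str) -> str:
    """Render a system action as text, with a fallback for unknown actions."""
    templates = {
        "offer_hotel": "I found a hotel for you.",
        "ask_repeat": "Sorry, could you rephrase that?",
    }
    return templates.get(action, "Sorry, something went wrong.")

# The assembled system: an ordered list of (name, component) pairs.
PIPELINE: List[Tuple[str, Callable[[str], str]]] = [
    ("nlu", toy_nlu),
    ("policy", toy_policy),
    ("nlg", toy_nlg),
]

def run_turn(utterance: str,
             overrides: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Run one turn and record every component's output.

    If `overrides` names a component, its output is replaced by the given
    value before the next component runs, mimicking manual correction.
    """
    overrides = overrides or {}
    trace: Dict[str, str] = {}
    value = utterance
    for name, component in PIPELINE:
        value = component(value)
        if name in overrides:
            value = overrides[name]  # manual correction by the developer
        trace[name] = value
    return trace

def summarize_intents(utterances: List[str]) -> Counter:
    """Count the recognized intent for each utterance in a batch."""
    return Counter(run_turn(u)["nlu"] for u in utterances)
```

Calling `run_turn("I need a place to sleep")` ends in the fallback request to rephrase, while passing `overrides={"nlu": "request_hotel"}` yields the hotel offer. That contrast localizes the failure to the toy understanding component rather than to the policy or generation stages, which is the kind of component-level diagnosis the interactive tool is meant to support.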