Manifesto from Dagstuhl Perspectives Workshop 24352 -- Conversational Agents: A Framework for Evaluation (CAFE)
This manifesto provides a structured evaluation framework for researchers and developers in conversational AI, though its contribution is incremental because it builds on existing workshop discussions.
The paper tackles the problem of evaluating conversational information access systems by proposing a framework called CAFE, which defines six components for systematic evaluation, including stakeholder goals and user tasks.
During the workshop, we discussed in depth what CONversational Information ACcess (CONIAC) is and what its unique features are, and we proposed a world model that abstracts it. We also defined the Conversational Agents Framework for Evaluation (CAFE) for evaluating CONIAC systems. CAFE consists of six major components: 1) goals of the system's stakeholders, 2) user tasks to be studied in the evaluation, 3) aspects of the users carrying out the tasks, 4) evaluation criteria to be considered, 5) evaluation methodology to be applied, and 6) measures for the quantitative criteria chosen.
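To make the six components concrete, an evaluation plan could be recorded as a simple data structure, as in the minimal Python sketch below. This is an illustration only, not something the manifesto defines: the class name `CafeEvaluationPlan`, its field names, the `missing_components` helper, and the example values are all hypothetical.

```python
from dataclasses import dataclass, field, fields
from typing import Dict, List


@dataclass
class CafeEvaluationPlan:
    """Hypothetical container mirroring the six CAFE components.

    All names here are assumptions made for illustration.
    """

    stakeholder_goals: List[str] = field(default_factory=list)  # 1) goals of the system's stakeholders
    user_tasks: List[str] = field(default_factory=list)         # 2) user tasks to be studied
    user_aspects: List[str] = field(default_factory=list)       # 3) aspects of the users
    criteria: List[str] = field(default_factory=list)           # 4) evaluation criteria
    methodology: str = ""                                       # 5) evaluation methodology
    measures: Dict[str, str] = field(default_factory=dict)      # 6) quantitative criterion -> measure

    def missing_components(self) -> List[str]:
        """Return the names of components that have not been filled in yet."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]


# Example values are invented placeholders, not taken from the manifesto.
plan = CafeEvaluationPlan(
    stakeholder_goals=["help users find relevant documents"],
    user_tasks=["exploratory search"],
    criteria=["user satisfaction"],
)
print(plan.missing_components())
```

A check such as `missing_components` reflects one possible reading of the framework, namely that an evaluation design is only complete once all six components have been specified.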