CLLG Apr 28, 2022

EVI: Multilingual Spoken Dialogue Tasks and Dataset for Knowledge-Based Enrolment, Verification, and Identification

arXiv:2204.13496v1629 citations
h-index: 55
Originality: Synthesis-oriented
AI Analysis

This work tackles authentication for personalized, privacy-focused spoken dialogue systems, providing a foundational dataset and benchmarks for multilingual research.

The authors formalize three knowledge-based authentication tasks for spoken dialogue systems (enrolment, verification, and identification), together with their evaluation protocols, and introduce EVI, a multilingual spoken dataset of 5,506 dialogues in English, Polish, and French. Their proposed models set the first competitive benchmarks on these tasks and explore the challenges of multilingual natural language processing of spoken dialogue.

Knowledge-based authentication is crucial for task-oriented spoken dialogue systems that offer personalised and privacy-focused services. Such systems should be able to enrol (E), verify (V), and identify (I) new and recurring users based on their personal information, e.g. postcode, name, and date of birth. In this work, we formalise the three authentication tasks and their evaluation protocols, and we present EVI, a challenging spoken multilingual dataset with 5,506 dialogues in English, Polish, and French. Our proposed models set the first competitive benchmarks, explore the challenges of multilingual natural language processing of spoken dialogue, and set directions for future research.
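To make the three tasks concrete, the following toy sketch shows one way enrolment, verification, and identification could relate to a store of user profiles. It is not the paper's method or models: the names `Profile` and `ToyKnowledgeBase`, the string user IDs, and the exact-match comparison after simple normalisation are all assumptions made for illustration. Real spoken input would arrive through speech recognition and value extraction, with errors that exact matching would not tolerate.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, Optional


@dataclass(frozen=True)
class Profile:
    """Hypothetical user profile built from the items named in the abstract."""
    name: str
    postcode: str
    dob: date


def _norm(text: str) -> str:
    # Assumption: lowercase and drop whitespace so "SW1A 1AA" equals "sw1a1aa".
    return "".join(text.lower().split())


def _matches(claimed: Profile, stored: Profile) -> bool:
    # Assumption: every item must match exactly after normalisation.
    return (
        _norm(claimed.name) == _norm(stored.name)
        and _norm(claimed.postcode) == _norm(stored.postcode)
        and claimed.dob == stored.dob
    )


class ToyKnowledgeBase:
    """Illustrative in-memory store; not an API from the EVI code release."""

    def __init__(self) -> None:
        self._profiles: Dict[str, Profile] = {}

    def enrol(self, user_id: str, profile: Profile) -> None:
        # Enrolment (E): record a new user's personal information.
        self._profiles[user_id] = profile

    def verify(self, user_id: str, claimed: Profile) -> bool:
        # Verification (V): does the claimed information match this known user?
        stored = self._profiles.get(user_id)
        return stored is not None and _matches(claimed, stored)

    def identify(self, claimed: Profile) -> Optional[str]:
        # Identification (I): find which enrolled user, if any, matches.
        for user_id, stored in self._profiles.items():
            if _matches(claimed, stored):
                return user_id
        return None


kb = ToyKnowledgeBase()
kb.enrol("u1", Profile("Ada Lovelace", "SW1A 1AA", date(1990, 12, 10)))
same_user = Profile("ada lovelace", "sw1a1aa", date(1990, 12, 10))
other_user = Profile("Ada Lovelace", "SW1A 1AA", date(1991, 12, 10))
```

Under these assumptions, `kb.verify("u1", same_user)` succeeds despite the differences in casing and spacing, while `other_user` fails both verification and identification because the date of birth differs.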

Code Implementations: 1 repo