Human-Machine Interaction Speech Corpus from the ROBIN project
This work provides a domain-specific resource for improving human-machine interaction in Romanian, but its contribution is incremental, as it applies existing methods to new data.
The paper introduces the ROBINTASC corpus, a Romanian speech dataset aimed at enhancing conversational agents for technical equipment purchases, and reports its positive impact on both low-latency ASR and dialogue systems.
This paper introduces a new Romanian speech corpus from the ROBIN project, called the ROBIN Technical Acquisition Speech Corpus (ROBINTASC). Its main purpose is to improve the behaviour of a conversational agent, allowing human-machine interaction in the context of purchasing technical equipment. The paper contains a detailed description of the acquisition process, corpus statistics, and an evaluation of the corpus's influence on a low-latency ASR system and a dialogue component.