AS CL SD Jan 20, 2021

VOTE400 (Voice Of The Elderly 400 Hours): A Speech Dataset to Study Voice Interface for Elderly-Care

Minsu Jang, Sangwon Seo, Dohyung Kim, Jaeyeon Lee, Jaehong Kim, Jun-Hwan Ahn

arXiv:2101.11469v1 (3.33 citations)

Originality: Synthesis-oriented

AI Analysis

This work addresses the need for improved voice interfaces in elderly-care robots, though it is incremental as it focuses on dataset creation rather than novel algorithmic advances.

The paper tackles the problem of speech recognition for elderly voices by introducing VOTE400, a large-scale Korean speech dataset with 400 hours of recordings from people aged 65 or over, and shows that a system trained on it outperforms conventional ones in recognizing elderly speech.

This paper introduces a large-scale Korean speech dataset, called VOTE400, that can be used for analyzing and recognizing the voices of elderly people. The dataset includes about 300 hours of continuous dialog speech and 100 hours of read speech, both recorded by people aged 65 years or over. A preliminary experiment showed that a speech recognition system trained with VOTE400 can outperform conventional systems in recognizing elderly people's speech. This work is a multi-organizational effort led by ETRI and MINDs Lab Inc. to advance the speech recognition performance of elderly-care robots.
