SD, AI, AS; Mar 28, 2018

The fifth 'CHiME' Speech Separation and Recognition Challenge: Dataset, task and baselines

arXiv:1803.10609v1, 724 citations
Originality: Synthesis-oriented
AI Analysis

This work addresses robust speech recognition in noisy, real-world settings and is aimed at researchers and practitioners in speech and language processing. As one installment in an ongoing challenge series, its contribution is incremental.

The paper introduces the 5th CHiME Challenge, which tackles the problem of distant multi-microphone conversational automatic speech recognition in real home environments, using a dinner party scenario with data recorded by multiple microphone arrays and pairs to advance robust ASR technology.

The CHiME challenge series aims to advance robust automatic speech recognition (ASR) technology by promoting research at the interface of speech and language processing, signal processing, and machine learning. This paper introduces the 5th CHiME Challenge, which considers the task of distant multi-microphone conversational ASR in real home environments. Speech material was elicited using a dinner party scenario with efforts taken to capture data that is representative of natural conversational speech and recorded by 6 Kinect microphone arrays and 4 binaural microphone pairs. The challenge features a single-array track and a multiple-array track and, for each track, distinct rankings will be produced for systems focusing on robustness with respect to distant-microphone capture vs. systems attempting to address all aspects of the task including conversational language modeling. We discuss the rationale for the challenge and provide a detailed description of the data collection procedure, the task, and the baseline systems for array synchronization, speech enhancement, and conventional and end-to-end ASR.
