SDAS, Oct 23, 2019

Low-frequency Compensated Synthetic Impulse Responses for Improved Far-field Speech Recognition

arXiv:1910.10815v3, 12 citations
Originality: Incremental advance
AI Analysis

This work addresses performance issues in far-field speech recognition systems, particularly for applications in noisy environments, but it is incremental as it builds on existing augmentation methods.

The paper tackles far-field speech recognition by generating low-frequency compensated synthetic impulse responses for data augmentation, reducing word-error-rate by up to 8.8% on the real-world LibriSpeech test set.

We propose a method for generating low-frequency compensated synthetic impulse responses that improve the performance of far-field speech recognition systems trained on artificially augmented datasets. We design linear-phase filters that adapt the simulated impulse responses to equalization distributions corresponding to real-world captured impulse responses. Our filtered synthetic impulse responses are then used to augment clean speech data from the LibriSpeech dataset [1]. We evaluate the performance of our method on the real-world LibriSpeech test set. In practice, our low-frequency compensated synthetic dataset can reduce the word-error-rate by up to 8.8% for far-field speech recognition.
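The pipeline described in the abstract (filter a simulated impulse response with a linear-phase filter, then convolve it with clean speech) can be illustrated with a minimal sketch. This is not the authors' implementation. The single low-shelf shape, the 500 Hz cutoff, the +6 dB gain, the toy impulse response, and all function names are hypothetical stand-ins. The paper instead fits its filters to equalization distributions measured from real impulse responses.

```python
import math
import random

def lowpass_taps(num_taps, cutoff_hz, sample_rate):
    """Hamming-windowed sinc low-pass. Symmetric taps give linear phase."""
    if num_taps % 2 == 0:
        raise ValueError("num_taps must be odd so the filter has a center tap")
    fc = cutoff_hz / sample_rate  # cutoff in cycles per sample
    mid = (num_taps - 1) // 2
    taps = []
    for n in range(num_taps):
        k = n - mid
        ideal = 2 * fc if k == 0 else math.sin(2 * math.pi * fc * k) / (math.pi * k)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    dc = sum(taps)
    return [t / dc for t in taps]  # normalize to unit gain at 0 Hz

def low_shelf_taps(num_taps, cutoff_hz, sample_rate, low_gain_db):
    """Linear-phase low shelf: a pure delay plus a scaled low-pass."""
    gain = 10 ** (low_gain_db / 20)
    taps = [(gain - 1) * t for t in lowpass_taps(num_taps, cutoff_hz, sample_rate)]
    taps[(num_taps - 1) // 2] += 1.0  # the delayed unit impulse
    return taps

def convolve(x, h):
    """Direct-form full convolution (fine for short toy signals)."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y

def gain_at(taps, freq_hz, sample_rate):
    """Magnitude of the filter's frequency response at freq_hz."""
    w = 2 * math.pi * freq_hz / sample_rate
    re = sum(t * math.cos(w * n) for n, t in enumerate(taps))
    im = -sum(t * math.sin(w * n) for n, t in enumerate(taps))
    return math.hypot(re, im)

SAMPLE_RATE = 16000  # LibriSpeech audio is sampled at 16 kHz

# Assumed compensation curve: +6 dB below roughly 500 Hz (illustrative only).
shelf = low_shelf_taps(101, 500.0, SAMPLE_RATE, 6.0)

# Toy stand-in for a simulated room impulse response: decaying noise.
rng = random.Random(0)
synthetic_ir = [rng.uniform(-1, 1) * math.exp(-n / 40) for n in range(200)]

# Step 1: compensate the synthetic impulse response.
compensated_ir = convolve(synthetic_ir, shelf)

# Step 2: augment "clean speech" (here a 220 Hz tone) with the filtered response.
clean = [math.sin(2 * math.pi * 220 * n / SAMPLE_RATE) for n in range(400)]
far_field = convolve(clean, compensated_ir)
```

Building the shelf as a delayed impulse plus a scaled symmetric low-pass keeps the taps symmetric, so the filter changes the magnitude response of the impulse response without adding phase distortion beyond a constant delay. A practical pipeline would likely use FFT-based convolution and measured target curves in place of the toy values here.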

Code Implementations: 1 repo