LG · CL · NE · ML · Jun 1, 2017

Transfer Learning for Speech Recognition on a Budget

arXiv:1706.00290v11137 citations
Originality: Incremental advance
AI Analysis

This work addresses the high cost of training ASR models for new languages and makes such training more accessible. It is an incremental advance because it builds on existing transfer learning methods.

The paper tackles the problem of training automated speech recognition (ASR) systems with limited resources by using transfer learning to adapt an English-trained model to German, achieving faster training on consumer-grade hardware and requiring less data for the same accuracy.

End-to-end training of automated speech recognition (ASR) systems requires massive data and compute resources. We explore transfer learning based on model adaptation as an approach for training ASR models under constrained GPU memory, throughput and training data. We conduct several systematic experiments adapting a Wav2Letter convolutional neural network originally trained for English ASR to the German language. We show that this technique allows faster training on consumer-grade resources while requiring less training data in order to achieve the same accuracy, thereby lowering the cost of training ASR models in other languages. Model introspection revealed that small adaptations to the network's weights were sufficient for good performance, especially for inner layers.
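The introspection finding above (small weight adaptations, especially in inner layers) can be illustrated with a per-layer comparison of the source and adapted weights. The following is a minimal sketch, not the authors' analysis code: the relative L2 change metric, the function name `relative_weight_change`, the layer names, and the toy weight values are all assumptions chosen for illustration and do not reproduce any result from the paper.

```python
import math


def relative_weight_change(source_layers, adapted_layers):
    """Return the per-layer relative L2 change ||w_adapted - w_source|| / ||w_source||.

    Both arguments map a layer name to a flat list of weights.
    This metric is an assumption; the paper may measure change differently.
    """
    changes = {}
    for name, src in source_layers.items():
        tgt = adapted_layers[name]
        diff_norm = math.sqrt(sum((t - s) ** 2 for s, t in zip(src, tgt)))
        src_norm = math.sqrt(sum(s ** 2 for s in src))
        # Guard against an all-zero source layer.
        changes[name] = diff_norm / src_norm if src_norm > 0 else float("inf")
    return changes


# Hypothetical toy weights standing in for an English-trained model
# and the same model after adaptation to German.
english = {"conv1": [3.0, 4.0], "conv2": [1.0, 0.0], "output": [0.0, 2.0]}
german = {"conv1": [3.0, 4.5], "conv2": [1.0, 0.0], "output": [2.0, 2.0]}

changes = relative_weight_change(english, german)
for layer, change in changes.items():
    print(f"{layer}: relative change {change:.2f}")
```

A low value for a layer means the adapted weights stay close to the source weights, which is the kind of pattern the abstract reports for inner layers. In practice the flat lists would be replaced by the flattened weight tensors of each layer from the two checkpoints.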

Code Implementations: 3 repos
