AS LG SD
Feb 17, 2022

Curriculum optimization for low-resource speech recognition

Anastasia Kuznetsova, Anurag Kumar, Jennifer Drexler Fox, Francis Tyers

arXiv:2202.08883v1
2.33 citations

Originality: Incremental advance

AI Analysis

The paper addresses the challenge of low-resource speech recognition. The advance is incremental because it builds on existing curriculum learning methods.

The paper tackles suboptimal data feeding pipelines in low-resource speech recognition. It proposes an automated curriculum learning approach that orders training examples according to the model's progress during training and the difficulty of each example. The reported result is up to a 33% relative improvement in Word Error Rate (WER) over the baseline.
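To make "relative improvement" concrete, here is a minimal sketch of the usual calculation, (baseline WER - new WER) / baseline WER. The function name and the WER values are hypothetical and chosen only for illustration; they are not figures from the paper.

```python
def relative_wer_improvement(baseline_wer: float, new_wer: float) -> float:
    """Fraction of the baseline WER removed by the new system."""
    if baseline_wer <= 0:
        raise ValueError("baseline WER must be positive")
    return (baseline_wer - new_wer) / baseline_wer


# Hypothetical values, not results from the paper:
# a baseline at 60% WER and a new system at 40% WER.
improvement = relative_wer_improvement(0.60, 0.40)
print(f"{improvement:.1%} relative improvement")
```

With these made-up numbers the absolute gain is 20 WER points, while the relative gain is about one third, which is why the two figures should not be confused.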

Modern end-to-end speech recognition models show astonishing results in transcribing audio signals into written text. However, conventional data feeding pipelines may be sub-optimal for low-resource speech recognition, which remains a challenging task. We propose an automated curriculum learning approach to optimize the sequence of training examples based on both the progress of the model during training and prior knowledge about the difficulty of the training examples. We introduce a new difficulty measure called compression ratio that can be used as a scoring function for raw audio in various noise conditions. The proposed method improves speech recognition Word Error Rate by up to 33% relative over the baseline system.
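The abstract does not say how compression ratio is computed, so the following is only a rough sketch of the idea under stated assumptions: the ratio is taken as compressed size over raw size of 16-bit PCM bytes, `zlib` stands in for whatever compressor the authors used, and more compressible clips are treated as easier. The helper names and the synthetic test signals are hypothetical. The sketch covers only the prior difficulty score, not the part of the method that adapts to model progress.

```python
import math
import random
import struct
import zlib


def compression_ratio(pcm_bytes: bytes) -> float:
    """Compressed size divided by raw size; higher means less compressible."""
    if not pcm_bytes:
        raise ValueError("empty audio buffer")
    return len(zlib.compress(pcm_bytes, 9)) / len(pcm_bytes)


def sine_pcm(n_samples: int, noise_amp: float, seed: int = 0) -> bytes:
    """Hypothetical test clip: 440 Hz tone at 16 kHz plus uniform noise, 16-bit PCM."""
    rng = random.Random(seed)
    samples = []
    for i in range(n_samples):
        clean = 0.5 * math.sin(2 * math.pi * 440 * i / 16000)
        noisy = clean + noise_amp * rng.uniform(-1.0, 1.0)
        # Clip to [-1, 1] before scaling to the 16-bit integer range.
        samples.append(int(max(-1.0, min(1.0, noisy)) * 32767))
    return struct.pack(f"<{n_samples}h", *samples)


clips = {
    "noisy": sine_pcm(16000, noise_amp=0.3),
    "clean": sine_pcm(16000, noise_amp=0.0),
}

# Assumption: sort from most to least compressible as an "easy to hard" order.
curriculum = sorted(clips, key=lambda name: compression_ratio(clips[name]))
for name in curriculum:
    print(name, round(compression_ratio(clips[name]), 3))
```

In this toy setup the noise-free tone compresses far better than the noisy one, so it is placed first. Real speech in real noise conditions would need the scoring function the paper actually defines.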
