CL LG
Nov 27, 2025

Supplementary Resources and Analysis for Automatic Speech Recognition Systems Trained on the Loquacious Dataset

Nick Rossenbach, Robin Schmitt, Tina Raissi, Simon Berger, Larissa Kleppel, Ralf Schlüter

arXiv:2512.17915v1
2.71 citations

Originality: Synthesis-oriented

AI Analysis

This work provides incremental resources to support a new dataset for researchers and practitioners in speech recognition, facilitating benchmarking and usability.

The authors address the need for a comprehensive benchmark in automatic speech recognition by releasing supplementary resources for the Loquacious dataset, namely language models and pronunciation lexica, and show experimentally that the dataset is a valuable study case for various ASR challenges.

The recently published Loquacious dataset aims to be a replacement for established English automatic speech recognition (ASR) datasets such as LibriSpeech or TED-LIUM. The main goal of the Loquacious dataset is to provide properly defined training and test partitions across many acoustic and language domains, with an open license suitable for both academia and industry. To further promote the benchmarking and usability of this new dataset, we present additional resources in the form of n-gram language models (LMs), a grapheme-to-phoneme (G2P) model, and pronunciation lexica, all with open and public access. Using these additional resources, we show experimental results across a wide range of ASR architectures with different label units and topologies. Our initial experimental results indicate that the Loquacious dataset offers a valuable study case for a variety of common challenges in ASR.
