LOTUSDIS: A Thai far-field meeting corpus for robust conversational ASR
The work addresses degraded ASR performance in far-field conditions for Thai-language applications. Its contribution is incremental: it centers on dataset creation and fine-tuning rather than on novel algorithms.
The researchers address robust conversational automatic speech recognition (ASR) for Thai far-field meetings by creating LOTUSDIS, a publicly available dataset of 114 hours of spontaneous dialogue with overlapping speech, recorded at distances of up to 10 meters. Fine-tuning a Thai Whisper model on this dataset substantially improved robustness, reducing overall word error rate (WER) from 64.3% to 38.3% and far-field WER from 81.6% to 49.5%.
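To put the reported gains in proportion, the WER figures above can be converted into absolute and relative reductions. The following is a minimal sketch of that arithmetic only; the function name `relative_reduction` is an illustrative choice and is not part of the released baseline.

```python
def relative_reduction(before: float, after: float) -> float:
    """Return the reduction from `before` to `after` as a percentage of `before`."""
    if before <= 0:
        raise ValueError("before must be positive")
    return 100.0 * (before - after) / before


# WER figures (in percent) as reported in the summary above.
reported_wer = {
    "overall": (64.3, 38.3),
    "far-field": (81.6, 49.5),
}

for condition, (zero_shot, fine_tuned) in reported_wer.items():
    absolute = zero_shot - fine_tuned
    relative = relative_reduction(zero_shot, fine_tuned)
    print(f"{condition}: {absolute:.1f} points absolute, {relative:.1f}% relative")
```

Under these numbers, fine-tuning removes roughly 40% of the errors in both conditions (about 40.4% overall and 39.3% far-field), even though the absolute gain is larger in the far-field case.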
We present LOTUSDIS, a publicly available Thai meeting corpus designed to advance far-field conversational ASR. The dataset comprises 114 hours of spontaneous, unscripted dialogue collected in 15-20 minute sessions with three participants, where overlapping speech is frequent and natural. Speech was recorded simultaneously by nine independent single-channel devices spanning six microphone types at distances from 0.12 m to 10 m, preserving the authentic effects of reverberation, noise, and device coloration without relying on microphone arrays. We provide standard train, dev, and test splits. We benchmarked several Whisper variants under zero-shot and fine-tuned conditions. Off-the-shelf models degraded strongly with distance, confirming a mismatch between pre-training data and Thai far-field speech. Fine-tuning on LOTUSDIS substantially improved robustness: a Thai Whisper baseline reduced overall WER from 64.3% to 38.3% and far-field WER from 81.6% to 49.5%, with especially large gains on the most distant microphones. These results underscore the importance of distance-diverse training data for robust ASR. The corpus is available under CC-BY-SA 4.0. We also release training and evaluation scripts as a reproducible baseline system to promote further research in this field.
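For readers unfamiliar with the metric, WER is conventionally the word-level edit distance (substitutions, deletions, and insertions) between reference and hypothesis, divided by the number of reference words. The sketch below illustrates that standard definition and is not the released evaluation script. It assumes both transcripts are already split into word tokens; since Thai is written without spaces between words, a segmentation step would be needed first, and the abstract does not say which one the authors use. The function name and the placeholder tokens are illustrative.

```python
from typing import Sequence


def word_error_rate(reference: Sequence[str], hypothesis: Sequence[str]) -> float:
    """Return WER in percent: word-level edit distance / number of reference words."""
    if not reference:
        raise ValueError("reference must contain at least one token")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis tokens.
    prev = list(range(len(hypothesis) + 1))
    for i, ref_token in enumerate(reference, start=1):
        curr = [i]  # deleting all i reference tokens matches an empty hypothesis
        for j, hyp_token in enumerate(hypothesis, start=1):
            substitution_cost = 0 if ref_token == hyp_token else 1
            curr.append(
                min(
                    prev[j] + 1,                      # deletion
                    curr[j - 1] + 1,                  # insertion
                    prev[j - 1] + substitution_cost,  # match or substitution
                )
            )
        prev = curr

    return 100.0 * prev[-1] / len(reference)


# Placeholder tokens standing in for pre-segmented words.
reference = ["w1", "w2", "w3", "w4"]
hypothesis = ["w1", "wX", "w3"]  # one substitution, one deletion
print(word_error_rate(reference, hypothesis))  # 50.0
```

Because insertions count as errors, WER can exceed 100% when the hypothesis is much longer than the reference, which is worth keeping in mind when reading far-field zero-shot results.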