CL · Sep 30, 2025

ASR Under Noise: Exploring Robustness for Sundanese and Javanese

Salsabila Zahirah Pranida, Muhammad Cendekia Airlangga, Rifo Ahmad Genadi, Shady Shehata

arXiv:2509.25878v1 · 1 citation · h-index: 6 · Proceedings of the 9th Widening NLP Workshop

Originality: Synthesis-oriented

AI Analysis

This work addresses ASR robustness for two Indonesian regional languages, but it is incremental because it applies existing methods to new data.

The paper tackles automatic speech recognition (ASR) robustness under noise for Sundanese and Javanese. Evaluating across a range of signal-to-noise ratios (SNRs), it finds that noise-aware training substantially improves performance, especially for larger Whisper models.

We investigate the robustness of Whisper-based automatic speech recognition (ASR) models for two major Indonesian regional languages: Javanese and Sundanese. While recent work has demonstrated strong ASR performance under clean conditions, the effectiveness of these models in noisy environments remains unclear. To address this, we experiment with multiple training strategies, including synthetic noise augmentation and SpecAugment, and evaluate performance across a range of signal-to-noise ratios (SNRs). Our results show that noise-aware training substantially improves robustness, particularly for larger Whisper models. A detailed error analysis further reveals language-specific challenges, highlighting avenues for future improvements.
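To make the idea of synthetic noise augmentation at a chosen SNR concrete, here is a minimal sketch in plain Python. It is not the authors' pipeline: the abstract does not state which noise sources, SNR values, or tooling were used, so the function names (`mix_at_snr`, `measured_snr_db`), the sine-wave stand-in for speech, the Gaussian noise, and the SNR grid in the demo are all illustrative assumptions. The sketch relies only on the standard definition SNR_dB = 10 log10(P_speech / P_noise), solved for the gain applied to the noise.

```python
import math
import random


def mean_power(signal):
    """Average power (mean of squared samples) of a sequence of floats."""
    return sum(x * x for x in signal) / len(signal)


def mix_at_snr(speech, noise, snr_db):
    """Return speech + scaled noise so the mixture has the requested SNR in dB.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if not speech or not noise:
        raise ValueError("speech and noise must be non-empty")
    # Repeat the noise if it is shorter than the speech, then crop to length.
    reps = -(-len(speech) // len(noise))  # ceiling division
    noise = (list(noise) * reps)[: len(speech)]
    p_speech = mean_power(speech)
    p_noise = mean_power(noise)
    if p_speech == 0 or p_noise == 0:
        raise ValueError("speech and noise must have non-zero power")
    # snr_db = 10*log10(p_speech / (gain^2 * p_noise)), solved for gain.
    gain = math.sqrt(p_speech / (p_noise * 10 ** (snr_db / 10)))
    return [s + gain * n for s, n in zip(speech, noise)]


def measured_snr_db(speech, mixture):
    """SNR of a mixture, treating (mixture - speech) as the noise component."""
    residual = [m - s for s, m in zip(speech, mixture)]
    return 10 * math.log10(mean_power(speech) / mean_power(residual))


if __name__ == "__main__":
    rng = random.Random(0)
    # Stand-ins: a 220 Hz tone at 16 kHz as "speech", Gaussian samples as noise.
    speech = [math.sin(2 * math.pi * 220 * t / 16000) for t in range(16000)]
    noise = [rng.gauss(0.0, 1.0) for _ in range(4000)]
    for snr_db in (20, 10, 0, -5):  # assumed grid, not the paper's
        noisy = mix_at_snr(speech, noise, snr_db)
        print(snr_db, round(measured_snr_db(speech, noisy), 2))
```

In a training setup along these lines, each clean utterance would presumably be mixed with a noise clip at a sampled SNR before feature extraction, while evaluation would sweep fixed SNR levels to trace how the error rate degrades.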
