SD · AI · Apr 26

Keyword spotting using convolutional neural network for speech recognition in Hindi

arXiv:2605.02928 · 1.7
Predicted impact: top 96% in SD · last 90 days
Originality: Synthesis-oriented
AI Analysis

It addresses the need for efficient on-device keyword spotting for Hindi speech, but the results are incremental given the use of standard methods.

This study applies keyword spotting using CNNs to Hindi speech recognition, achieving 91.79% accuracy on a dataset of 40,000 audio samples.

In this study, we investigate keyword spotting (KWS) for Hindi speech recognition using a dataset of 40,000 audio samples. The samples have a sampling rate of 44 kHz and an average duration of 1.9 seconds, and we focus on developing an efficient on-device KWS system tailored to user-specific queries. We convert the raw audio recordings into Mel Frequency Cepstral Coefficients (MFCCs), which serve as input to Convolutional Neural Networks (CNNs) for classification. Our experiments cover several CNN architectures and assess how well each identifies predefined keywords within a continuous speech stream. In evaluation, our CNN-based approach achieves an accuracy of 91.79%, demonstrating promising performance while maintaining computational efficiency and user-specific customization in Hindi speech recognition.
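The MFCC step mentioned in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' pipeline: the abstract gives only the sampling rate (44 kHz) and the average clip length (1.9 s). The 25 ms frame, 10 ms hop, 40 mel filters, 13 coefficients, and the function names `mel_filterbank` and `mfcc` are assumptions chosen for illustration.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_mels, n_fft, sr):
    """Triangular filters spaced evenly on the mel scale from 0 Hz to sr/2."""
    hz_pts = mel_to_hz(np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2))
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / sr)
    fb = np.zeros((n_mels, freqs.size))
    for i in range(n_mels):
        lo, mid, hi = hz_pts[i], hz_pts[i + 1], hz_pts[i + 2]
        rising = (freqs - lo) / (mid - lo)
        falling = (hi - freqs) / (hi - mid)
        fb[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return fb

def mfcc(signal, sr, n_mfcc=13, n_mels=40, frame_ms=25.0, hop_ms=10.0):
    """Return an (n_frames, n_mfcc) array. All default parameters are assumed values."""
    signal = np.asarray(signal, dtype=float)
    frame_len = int(round(sr * frame_ms / 1000.0))
    hop = int(round(sr * hop_ms / 1000.0))
    n_fft = 1 << (frame_len - 1).bit_length()  # next power of two >= frame_len
    if signal.size < frame_len:                # pad very short clips to one frame
        signal = np.pad(signal, (0, frame_len - signal.size))
    n_frames = 1 + (signal.size - frame_len) // hop

    # Slice overlapping frames and apply a Hamming window to each.
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hamming(frame_len)

    # Power spectrum -> mel filterbank energies -> log compression.
    power = np.abs(np.fft.rfft(frames, n=n_fft, axis=1)) ** 2
    log_mel = np.log(power @ mel_filterbank(n_mels, n_fft, sr).T + 1e-10)

    # Unnormalized DCT-II over the mel axis; keep the first n_mfcc coefficients.
    k = np.arange(n_mfcc)[:, None]
    n = np.arange(n_mels)[None, :]
    dct = np.cos(np.pi * k * (2 * n + 1) / (2.0 * n_mels))
    return log_mel @ dct.T

# Usage: a synthetic 1.9 s tone at 44 kHz stands in for a real recording.
sr = 44000
t = np.arange(int(round(1.9 * sr))) / sr
features = mfcc(np.sin(2 * np.pi * 440.0 * t), sr)
print(features.shape)  # (188, 13) under the assumed frame and hop sizes
```

Under these assumptions, an average-length clip becomes a 188 × 13 matrix, which a CNN can treat as a single-channel image. In practice a library routine such as `librosa.feature.mfcc` would normally replace the hand-written version.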

