CL CV IR MM | Aug 27, 2022

Minimal Feature Analysis for Isolated Digit Recognition for varying encoding rates in noisy environments

arXiv:2208.13100v1 | 1 citation | h-index: 20
Originality: Synthesis-oriented
AI Analysis

This work addresses speech recognition robustness for real-time applications in noisy environments, but it is incremental as it applies existing methods to new data combinations.

The study analyzed isolated digit recognition under varying bit rates and noise levels, finding that MFCC features performed best with a 16 kHz sampling rate and 16-bit encoding in noisy conditions.

This work concerns recent developments in speech recognition. It analyses isolated digit recognition at different bit rates and at different noise levels. The experiments were carried out using Audacity and the HTK toolkit, with a Hidden Markov Model (HMM) as the recognition model. The feature extraction techniques used are Mel Frequency Cepstral Coefficients (MFCC), Linear Predictive Coding (LPC), Perceptual Linear Prediction (PLP), mel spectrum (MELSPEC), and filter bank (FBANK). Three types of noise were considered for testing: random noise, fan noise, and random noise in a real-time environment. This was done to identify the environment best suited to real-time applications. Further, five commonly used bit rates at different sampling rates were considered to find the optimal bit rate.
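To make the two experimental variables concrete, here is a minimal sketch in Python of how an uncompressed PCM bit rate follows from sampling rate and bit depth, and how a noise recording could be mixed into a clean utterance at a chosen level. It is illustrative only: the paper's pipeline uses Audacity and HTK, and the abstract does not say how noise levels were set. The use of a signal-to-noise ratio (SNR) in decibels, and the function names `pcm_bit_rate`, `mix_at_snr`, and `measured_snr_db`, are assumptions made for this example.

```python
import math
import random


def pcm_bit_rate(sample_rate_hz, bits_per_sample, channels=1):
    """Bit rate of uncompressed PCM audio, in bits per second."""
    return sample_rate_hz * bits_per_sample * channels


def mean_power(samples):
    """Average squared amplitude of a sequence of samples."""
    return sum(s * s for s in samples) / len(samples)


def mix_at_snr(clean, noise, snr_db):
    """Add `noise` to `clean`, scaled so the clean-to-noise power ratio is `snr_db`.

    Assumption: noise level is expressed as an SNR in dB; the source does not
    state how its noise levels were defined.
    """
    if len(clean) != len(noise):
        raise ValueError("clean and noise must have the same length")
    noise_power = mean_power(noise)
    if noise_power == 0:
        raise ValueError("noise signal is silent")
    target_noise_power = mean_power(clean) / (10 ** (snr_db / 10))
    gain = math.sqrt(target_noise_power / noise_power)
    return [c + gain * n for c, n in zip(clean, noise)]


def measured_snr_db(clean, noisy):
    """SNR in dB, recovered from a clean signal and its noisy version."""
    residual = [y - c for c, y in zip(clean, noisy)]
    return 10 * math.log10(mean_power(clean) / mean_power(residual))


if __name__ == "__main__":
    # 16 kHz sampling with 16-bit mono samples
    print(pcm_bit_rate(16000, 16))  # 256000 bits per second

    # Synthetic stand-ins for a spoken digit and a random-noise recording
    rng = random.Random(0)
    clean = [math.sin(2 * math.pi * 440 * t / 16000) for t in range(16000)]
    noise = [rng.uniform(-1.0, 1.0) for _ in range(16000)]
    noisy = mix_at_snr(clean, noise, snr_db=10.0)
    print(round(measured_snr_db(clean, noisy), 2))
```

In a real experiment the synthetic sine wave and uniform noise would be replaced by recorded digits and recorded fan or ambient noise, and the mixed signal would be re-encoded at each bit rate before feature extraction.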
