CV · Jun 12, 2012

A Novel Windowing Technique for Efficient Computation of MFCC for Speaker Recognition

arXiv:1206.2437v1 · 87 citations
Originality: Incremental advance
AI Analysis

This work addresses speaker recognition and offers an incremental improvement in signal processing for speech analysis.

The authors aim to improve speaker recognition by proposing a novel windowing technique for computing Mel Frequency Cepstral Coefficients (MFCC). They report substantial and consistent performance improvements over the baseline Hamming window and over a recent multitaper method.

In this paper, we propose a novel family of windowing techniques to compute Mel Frequency Cepstral Coefficients (MFCC) for automatic speaker recognition from speech. The proposed method is based on a fundamental property of the discrete time Fourier transform (DTFT) related to differentiation in the frequency domain. A classical windowing scheme, such as the Hamming window, is modified to obtain derivatives of the discrete time Fourier transform coefficients. It is shown mathematically that the slope and phase of the power spectrum are inherently incorporated in the newly computed cepstrum. Speaker recognition systems based on our proposed family of window functions are shown to attain substantial and consistent performance improvement over the baseline single-tapered Hamming window as well as the recently proposed multitaper windowing technique.
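The DTFT property the abstract refers to is the standard differentiation-in-frequency rule: if x[n] has DTFT X(ω), then n·x[n] has DTFT j·dX(ω)/dω. The sketch below checks this numerically for a Hamming-windowed frame. It assumes the modified window is the Hamming window multiplied by the sample index n; the abstract does not give the exact definition, so this is an illustration of the property rather than the authors' implementation. The helper names (`differentiated_window`, `check_property`) and the test frame are hypothetical.

```python
import cmath
import math


def hamming(N):
    """Standard Hamming window of length N."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (N - 1)) for n in range(N)]


def differentiated_window(w, order=1):
    """Multiply a window by n**order (assumed form of the modified window)."""
    return [(n ** order) * w_n for n, w_n in enumerate(w)]


def dtft(x, omega):
    """DTFT of a finite sequence x[0..N-1] at one frequency omega (rad/sample)."""
    return sum(x_n * cmath.exp(-1j * omega * n) for n, x_n in enumerate(x))


def check_property(frame, omega, h=1e-6):
    """Return (DTFT of n*w[n]*x[n], j * d/d_omega of DTFT of w[n]*x[n])."""
    w = hamming(len(frame))
    windowed = [a * b for a, b in zip(frame, w)]
    ramped = [a * b for a, b in zip(frame, differentiated_window(w))]
    lhs = dtft(ramped, omega)
    # Central finite difference approximates the derivative with respect to omega.
    derivative = (dtft(windowed, omega + h) - dtft(windowed, omega - h)) / (2 * h)
    return lhs, 1j * derivative


if __name__ == "__main__":
    # Synthetic frame standing in for a short speech segment.
    frame = [math.sin(0.3 * n) + 0.5 * math.cos(1.1 * n) for n in range(64)]
    for omega in (0.1, 0.7, 2.0):
        lhs, rhs = check_property(frame, omega)
        print(f"omega={omega}: |difference| = {abs(lhs - rhs):.2e}")
```

The two sides agree up to finite-difference error, which is why windowing with n·w[n] before the transform yields spectral-derivative coefficients without any explicit differentiation.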

