Modified Mel Filter Bank to Compute MFCC of Subsampled Speech
This work targets speech and speaker recognition applications that operate on subsampled speech. Its contribution appears incremental, since it modifies an existing filter bank.
The authors tackled the problem of extracting MFCCs from subsampled speech by proposing a modified Mel filter bank and a stronger correlation metric, resulting in recognition performance on resampled speech close to that on original speech.
Mel Frequency Cepstral Coefficients (MFCCs) are the most widely used features in speech and speaker recognition applications. In this work, we propose a modified Mel filter bank to extract MFCCs from subsampled speech. We also propose a stronger metric that effectively captures the correlation between the MFCCs of original speech and the MFCCs of resampled speech. The proposed filter bank construction performs notably well, giving recognition performance on resampled speech close to that on original speech.
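For orientation, the sketch below builds a conventional triangular Mel filter bank, the baseline that a modified bank for subsampled speech would start from. It is not the authors' modified construction, which this abstract does not specify. The function names (`mel_filter_bank`, `pearson`), the 2595/700 Mel formula variant, the 16 kHz and 8 kHz sampling rates, and the use of Pearson correlation as a stand-in for the proposed metric are all assumptions made for illustration.

```python
import numpy as np


def hz_to_mel(f):
    # A common Mel-scale formula; other variants exist.
    return 2595.0 * np.log10(1.0 + np.asarray(f, dtype=float) / 700.0)


def mel_to_hz(m):
    return 700.0 * (10.0 ** (np.asarray(m, dtype=float) / 2595.0) - 1.0)


def mel_filter_bank(n_filters, n_fft, sample_rate, f_max=None):
    """Return triangular Mel filters with shape (n_filters, n_fft // 2 + 1)."""
    f_max = sample_rate / 2.0 if f_max is None else f_max
    # n_filters + 2 edge frequencies, uniformly spaced on the Mel scale.
    edges_hz = mel_to_hz(
        np.linspace(hz_to_mel(0.0), hz_to_mel(f_max), n_filters + 2)
    )
    # Centre frequency of every one-sided FFT bin.
    bin_hz = np.linspace(0.0, sample_rate / 2.0, n_fft // 2 + 1)
    bank = np.zeros((n_filters, bin_hz.size))
    for i in range(n_filters):
        lo, mid, hi = edges_hz[i], edges_hz[i + 1], edges_hz[i + 2]
        rising = (bin_hz - lo) / (mid - lo)
        falling = (hi - bin_hz) / (hi - mid)
        # Triangle: the lower of the two slopes, floored at zero.
        bank[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return bank


def pearson(a, b):
    """Pearson correlation of two equal-length feature vectors
    (an assumed stand-in, not the metric proposed in the paper)."""
    a = np.ravel(a) - np.mean(a)
    b = np.ravel(b) - np.mean(b)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


# Hypothetical setup: original speech at 16 kHz, subsampled speech at 8 kHz.
full_bank = mel_filter_bank(n_filters=20, n_fft=512, sample_rate=16000)
sub_bank = mel_filter_bank(n_filters=20, n_fft=256, sample_rate=8000)
```

With these settings the subsampled bank spans only 0 to 4 kHz, so its filters sit at different centre frequencies from those of the 16 kHz bank. That mismatch between the two banks is the problem a modified filter bank is meant to address.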