SD LG

Feb 25, 2016

PCA/LDA Approach for Text-Independent Speaker Recognition

Zhenhao Ge, Sudhendu R. Sharma, Mark J. T. Smith

arXiv:1602.08045v1

12 citations

AI Analysis

This work addresses efficiency and accuracy in speaker recognition for applications like security or authentication, but it is incremental as it combines existing techniques (PCA and LDA) rather than introducing a fundamentally new method.

The paper tackles text-independent speaker recognition by proposing a PCA/LDA-based approach that achieves competitive accuracy, with classification rates of 100%, 96%, and 95% for populations of 50, 100, and 200 speakers, while being faster than traditional methods like MFCC-GMM.

Various algorithms for text-independent speaker recognition have been developed over the decades, aiming to improve both accuracy and efficiency. This paper presents a novel PCA/LDA-based approach that is faster than traditional statistical model-based methods and achieves competitive results. First, the performance of PCA alone and of LDA alone is measured; then a mixed model that takes advantage of both methods is introduced. A subset of the TIMIT corpus, composed of 200 male speakers, is used for enrollment, validation, and testing. The best results achieve classification rates of 100%, 96%, and 95% at population sizes of 50, 100, and 200 speakers, using 39-dimensional MFCC features with delta and double delta. These results are based on 12 seconds of text-independent speech for training and 4 seconds for testing. They are comparable to those of conventional MFCC-GMM methods, but require significantly less time to train and operate.
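As a rough illustration of the "39-dimensional MFCC features with delta and double delta" mentioned above, the sketch below stacks static coefficients with their first- and second-order deltas. It assumes 13 static coefficients per frame (so that 13 × 3 = 39) and the common regression-style delta with a window of 2 frames on each side; the abstract states neither, and the names `delta` and `stack_with_deltas` are hypothetical. The input is made-up toy data, not real MFCCs.

```python
def delta(frames, n=2):
    """Regression-style delta over a list of equal-length frames.

    Assumption: window of n frames on each side, with edge frames repeated.
    The abstract does not specify how the deltas were computed.
    """
    num_frames = len(frames)
    denom = 2 * sum(k * k for k in range(1, n + 1))
    out = []
    for t in range(num_frames):
        row = []
        for d in range(len(frames[t])):
            acc = 0.0
            for k in range(1, n + 1):
                nxt = frames[min(t + k, num_frames - 1)][d]  # clamp at last frame
                prv = frames[max(t - k, 0)][d]               # clamp at first frame
                acc += k * (nxt - prv)
            row.append(acc / denom)
        out.append(row)
    return out


def stack_with_deltas(static):
    """Concatenate static, delta, and double-delta coefficients per frame."""
    d1 = delta(static)
    d2 = delta(d1)
    return [s + a + b for s, a, b in zip(static, d1, d2)]


# Toy input: 5 frames of 13 made-up "static" coefficients (not real MFCCs).
static = [[float(t + c) for c in range(13)] for t in range(5)]
features = stack_with_deltas(static)
print(len(features), len(features[0]))
```

Each output frame here has 39 values. In the approach the abstract describes, vectors of this kind would then be passed to the PCA and LDA projections.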
