AS SD

May 9, 2018

Speaker Recognition using Deep Belief Networks

Adrish Banerjee, Akash Dubey, Abhishek Menon, Shubham Nanda, Gora Chand Nandi

arXiv:1805.08865v13.314 citations

Originality: Incremental advance

AI Analysis

This work addresses speaker recognition for audio processing applications, presenting an incremental improvement over existing methods.

The paper tackles speaker recognition by using deep belief networks (DBNs) to learn short-term spectral features from speech signals and augmenting them with MFCC features. On the ELSDSR dataset, this reaches a recognition accuracy of 0.95, compared with 0.90 for MFCC features alone.

Short-term spectral features such as mel-frequency cepstral coefficients (MFCCs) have previously been deployed in state-of-the-art speaker recognition systems. However, less attention has been paid to short-term spectral features that generative learning models can learn from speech signals. Higher-dimensional encoders such as deep belief networks (DBNs) could improve performance in speaker recognition tasks by better modelling the statistical structure of sound waves. In this paper, we use short-term spectral features learnt by a DBN, augmented with MFCC features, to perform speaker recognition. Using these features, we achieved a recognition accuracy of 0.95, compared with 0.90 when using standalone MFCC features, on the ELSDSR dataset.
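The feature augmentation described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the DBN architecture, its input representation, or how the two feature sets are combined. The sketch assumes plain concatenation, a two-layer stack of RBM-style sigmoid layers, 13 MFCCs per frame, and a 20-bin spectral input. The weights are random placeholders standing in for a trained DBN, and all function names are hypothetical.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def hidden_activations(frame, weights, hidden_bias):
    """Hidden-unit probabilities of one RBM-style layer for one input frame."""
    return [
        sigmoid(b + sum(w * v for w, v in zip(row, frame)))
        for row, b in zip(weights, hidden_bias)
    ]

def dbn_features(frame, layers):
    """Propagate a frame upward through a stack of layers (the DBN)."""
    activation = frame
    for weights, hidden_bias in layers:
        activation = hidden_activations(activation, weights, hidden_bias)
    return activation

def augment(mfcc_frame, spectral_frame, layers):
    """Assumed augmentation: concatenate MFCCs with the top-layer DBN features."""
    return list(mfcc_frame) + dbn_features(spectral_frame, layers)

def random_layer(n_in, n_out, rng):
    """Placeholder for a trained layer: small random weights, zero biases."""
    weights = [[rng.gauss(0.0, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

rng = random.Random(0)
# Assumed sizes: 20 spectral bins -> 8 hidden units -> 4 hidden units.
layers = [random_layer(20, 8, rng), random_layer(8, 4, rng)]
mfcc_frame = [rng.gauss(0.0, 1.0) for _ in range(13)]   # stand-in MFCC vector
spectral_frame = [rng.random() for _ in range(20)]      # stand-in spectral frame
features = augment(mfcc_frame, spectral_frame, layers)
```

In a real system the layers would be pretrained on speech frames, and the augmented vectors would be passed to a speaker classifier. Neither step is shown here.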
