Feature Learning with Gaussian Restricted Boltzmann Machine for Robust Speech Recognition
This work addresses robust speech recognition for applications in noisy environments, but it is incremental as it builds on existing GRBM methods.
The authors tackle robust speech recognition by proposing a new GRBM variant, the multivariate Gaussian restricted Boltzmann machine (MGRBM), and using it to extract features. On the Aurora2 dataset, the extracted features perform much better than MFCC, with MGRBM slightly outperforming GRBM.
In this paper, we first present a new variant of the Gaussian restricted Boltzmann machine (GRBM), called the multivariate Gaussian restricted Boltzmann machine (MGRBM), together with its definition and learning algorithm. We then propose using a learned GRBM or MGRBM to extract better features for robust speech recognition. Our experiments on Aurora2 show that both GRBM-extracted and MGRBM-extracted features perform much better than Mel-frequency cepstral coefficients (MFCC) with either an HMM-GMM or a hybrid HMM-deep neural network (DNN) acoustic model, and that MGRBM-extracted features are slightly better than GRBM-extracted ones.
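To make the feature-extraction step concrete, the following is a minimal sketch of how a learned GRBM might map one acoustic frame to a feature vector. It assumes one common GRBM parametrization, in which the hidden activation probability is p(h_j = 1 | v) = sigmoid(c_j + sum_i W_ij v_i / sigma_i^2); the abstract does not state which parametrization is used, and the MGRBM variant is not covered here because its definition is not given in this text. The function name `grbm_hidden_features`, the toy weights, and the three-dimensional input frame are all hypothetical.

```python
import math
from typing import List, Sequence


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def grbm_hidden_features(
    v: Sequence[float],
    weights: Sequence[Sequence[float]],  # indexed as weights[visible][hidden]
    hidden_bias: Sequence[float],
    sigma: Sequence[float],              # per-visible-unit standard deviation
) -> List[float]:
    """Return p(h_j = 1 | v) for every hidden unit.

    Assumed parametrization (not taken from the paper):
    p(h_j = 1 | v) = sigmoid(c_j + sum_i W_ij * v_i / sigma_i^2).
    """
    features = []
    for j in range(len(hidden_bias)):
        activation = hidden_bias[j]
        for i, v_i in enumerate(v):
            activation += weights[i][j] * v_i / (sigma[i] ** 2)
        features.append(sigmoid(activation))
    return features


# Toy example: a 3-dimensional input frame and 2 hidden units.
# All values below are made up for illustration.
frame = [0.5, -1.0, 2.0]
W = [[0.1, -0.2], [0.0, 0.3], [0.4, 0.1]]
c = [0.0, -0.1]
sigma = [1.0, 1.0, 1.0]

features = grbm_hidden_features(frame, W, c, sigma)
print(features)
```

In a pipeline of the kind the abstract describes, a vector like `features` would take the place of the MFCC vector as input to the acoustic model; how frames are preprocessed or stacked before this step is not specified here.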