Model Interpolation with Trans-dimensional Random Field Language Models for Speech Recognition
This work addresses speech recognition accuracy for users of English and Chinese systems, presenting an incremental improvement through model combination.
The paper tackles the problem of improving speech recognition accuracy by interpolating trans-dimensional random field language models with neural network models, achieving relative error rate reductions of 12.1% for English and 17.9% for Chinese over 6-gram language models.
The dominant language models (LMs), such as n-gram and neural network (NN) models, represent sentence probabilities in terms of conditionals. In contrast, a trans-dimensional random field (TRF) LM, in which the whole sentence is modeled as a random field, was recently introduced and shown to achieve superior performance. In this paper, we examine how TRF models can be interpolated with NN models, and obtain 12.1\% and 17.9\% relative error rate reductions over 6-gram LMs for English and Chinese speech recognition, respectively, through log-linear combination.
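As an illustration of log-linear combination, the following is a minimal sketch of how sentence-level log scores from two LMs (for example a TRF model and an NN model) might be combined to rescore an N-best list. The function names, the hypothesis layout (`text`, `am`, `lm`), the interpolation weights, and the scores are all hypothetical and are not taken from the paper; the sketch only assumes that each model supplies a log score for the whole sentence. The global normalization constant of the combined model is omitted because it is the same for every hypothesis and does not affect the ranking.

```python
from typing import Dict, List, Sequence


def log_linear_score(log_probs: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted sum of per-model sentence log scores (unnormalized)."""
    if len(log_probs) != len(weights):
        raise ValueError("need exactly one weight per model score")
    return sum(w * lp for w, lp in zip(weights, log_probs))


def rescore_nbest(nbest: List[Dict], weights: Sequence[float]) -> Dict:
    """Return the hypothesis with the highest acoustic + combined LM score.

    Each hypothesis is a dict with hypothetical keys:
      "text": the word sequence,
      "am":   the acoustic log score,
      "lm":   a list of sentence log scores, one per language model.
    """
    return max(nbest, key=lambda h: h["am"] + log_linear_score(h["lm"], weights))


# Made-up scores for two hypotheses; "lm" holds [model_1, model_2] log scores.
nbest = [
    {"text": "hypothesis a", "am": -100.0, "lm": [-10.0, -30.0]},
    {"text": "hypothesis b", "am": -101.0, "lm": [-12.0, -20.0]},
]

best_combined = rescore_nbest(nbest, weights=(0.5, 0.5))
best_single = rescore_nbest(nbest, weights=(1.0, 0.0))
```

With equal weights the second hypothesis wins, whereas using only the first model selects the first hypothesis, which shows how the combination can change the recognition output. In practice the weights would be tuned on held-out data.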