Dynamic Bayesian Multinets
This work addresses classification in speech recognition by proposing an incremental change to model structure that improves recognition performance.
The paper tackles classification in speech recognition by introducing dynamic Bayesian multinets, whose discriminative network structures approximate the class posterior probability. On an isolated-word speech recognition task, the resulting models outperform HMMs and other dynamic Bayesian networks with a similar number of parameters.
In this work, dynamic Bayesian multinets are introduced where a Markov chain state at time t determines conditional independence patterns between random variables lying within a local time window surrounding t. It is shown how information-theoretic criterion functions can be used to induce sparse, discriminative, and class-conditional network structures that yield an optimal approximation to the class posterior probability, and therefore are useful for the classification task. Using a new structure learning heuristic, the resulting models are tested on a medium-vocabulary isolated-word speech recognition task. It is demonstrated that these discriminatively structured dynamic Bayesian multinets, when trained in a maximum likelihood setting using EM, can outperform both HMMs and other dynamic Bayesian networks with a similar number of parameters.
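The central idea, a Markov chain state that selects which nearby variables an observation depends on, can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the two hidden states, the scalar observations, the linear-Gaussian dependence on past observations, and all numeric values and names (`PARENTS`, `WEIGHTS`, `log_likelihood`, and so on) are hypothetical. The structure-learning heuristic and EM training are not shown; the sketch only evaluates the likelihood of a sequence under a fixed, state-dependent structure.

```python
import math

# Hypothetical toy model (not the paper's): 2 hidden states, scalar observations.
# PARENTS[q] lists the time offsets, relative to t, of the past observations
# that x_t depends on when the chain is in state q at time t. Different states
# therefore induce different conditional independence patterns.
PARENTS = {0: [-1], 1: [-2]}
WEIGHTS = {0: [0.9], 1: [-0.5]}   # one regression weight per parent (assumed)
BIAS = {0: 0.0, 1: 1.0}
VAR = {0: 1.0, 1: 0.5}
INIT = [0.6, 0.4]                 # initial state distribution
TRANS = [[0.7, 0.3], [0.2, 0.8]]  # TRANS[p][q] = P(q_t = q | q_{t-1} = p)


def log_gauss(x, mean, var):
    """Log density of a univariate Gaussian."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)


def log_obs(xs, t, q):
    """log p(x_t | parents selected by state q, q_t = q)."""
    mean = BIAS[q]
    for offset, weight in zip(PARENTS[q], WEIGHTS[q]):
        if t + offset >= 0:  # parents before the start of the sequence are dropped
            mean += weight * xs[t + offset]
    return log_gauss(xs[t], mean, VAR[q])


def logsumexp(values):
    """Numerically stable log(sum(exp(v)))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def log_likelihood(xs):
    """Forward recursion in the log domain; parents are observed, so the
    usual HMM-style recursion over the hidden state still applies."""
    n = len(INIT)
    alpha = [math.log(INIT[q]) + log_obs(xs, 0, q) for q in range(n)]
    for t in range(1, len(xs)):
        alpha = [
            logsumexp([alpha[p] + math.log(TRANS[p][q]) for p in range(n)])
            + log_obs(xs, t, q)
            for q in range(n)
        ]
    return logsumexp(alpha)


if __name__ == "__main__":
    print(log_likelihood([0.3, -0.1, 0.8, 0.2]))
```

In an isolated-word setting, one plausible use of such a function would be to score an utterance under one model per word and pick the word with the highest score. Setting every entry of `PARENTS` to an empty list reduces the sketch to an ordinary HMM with Gaussian emissions, which makes the role of the state-dependent structure easy to see.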