LG CL ML · Oct 11, 2013

A Bayesian Network View on Acoustic Model-Based Techniques for Robust Speech Recognition

Roland Maas, Christian Huemmer, Armin Sehr, Walter Kellermann

arXiv:1310.3099v2

AI Analysis

This work offers researchers in speech recognition a theoretical unification, but its contribution is incremental: it synthesizes known methods and introduces no new empirical results.

The paper provides a unifying Bayesian network framework to analyze and compare existing acoustic model adaptation, missing feature, and uncertainty decoding techniques for robust speech recognition, deriving compensation rules and highlighting structural differences.

This article provides a unifying Bayesian network view on various approaches for acoustic model adaptation, missing feature, and uncertainty decoding that are well-known in the literature of robust automatic speech recognition. The representatives of these classes can often be deduced from a Bayesian network that extends the conventional hidden Markov models used in speech recognition. These extensions, in turn, can in many cases be motivated from an underlying observation model that relates clean and distorted feature vectors. By converting the observation models into a Bayesian network representation, we formulate the corresponding compensation rules leading to a unified view on known derivations as well as to new formulations for certain approaches. The generic Bayesian perspective provided in this contribution thus highlights structural differences and similarities between the analyzed approaches.
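To make the idea of a compensation rule derived from an observation model concrete, here is a minimal sketch. It is not taken from the paper. It assumes the simplest possible observation model, y = x + b, where the clean feature x given the HMM state q is Gaussian and the distortion b is an independent Gaussian. The model itself, the function names, and the symbols (mu_q, var_q, mu_b, var_b) are all illustrative assumptions. Under these assumptions, marginalizing the unobserved clean feature gives a Gaussian whose means and variances add, and the sketch checks this closed form against brute-force numerical integration.

```python
import math


def gaussian_logpdf(x, mean, var):
    """Log density of a univariate Gaussian N(mean, var) evaluated at x."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def compensated_loglik(y, mu_q, var_q, mu_b, var_b):
    """log p(y | q) for diagonal Gaussians under the ASSUMED model y = x + b.

    Integrating out the clean feature x yields, per dimension,
    N(y; mu_q + mu_b, var_q + var_b).
    """
    return sum(
        gaussian_logpdf(y_d, m_q + m_b, v_q + v_b)
        for y_d, m_q, v_q, m_b, v_b in zip(y, mu_q, var_q, mu_b, var_b)
    )


def numeric_loglik_1d(y, mu_q, var_q, mu_b, var_b, steps=20000, width=12.0):
    """Brute-force check in one dimension.

    Approximates the integral of p(y | x) * p(x | q) over x with a
    midpoint rule on a grid centred at mu_q.
    """
    sd = math.sqrt(var_q)
    lo = mu_q - width * sd
    hi = mu_q + width * sd
    dx = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * dx
        log_integrand = gaussian_logpdf(y, x + mu_b, var_b) + gaussian_logpdf(x, mu_q, var_q)
        total += math.exp(log_integrand) * dx
    return math.log(total)


if __name__ == "__main__":
    closed = compensated_loglik([1.3], [0.5], [2.0], [0.2], [0.5])
    numeric = numeric_loglik_1d(1.3, 0.5, 2.0, 0.2, 0.5)
    print(closed, numeric)  # the two values should agree closely
```

The paper covers more general observation models; this additive Gaussian case is chosen only because the marginal has a closed form that is easy to verify.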
