Marginal Inference queries in Hidden Markov Models under context-free grammar constraints
This work addresses a sophisticated marginal inference problem for sequential probabilistic models, with relevance to computational linguistics and NLP, but it is incremental in that it builds on existing HMM and CFG frameworks.
The paper tackles the problem of computing the likelihood of context-free grammars in Hidden Markov Models. It shows that the problem is NP-hard even when the grammar's degree of ambiguity is at most 2, gives an exact dynamic algorithm for unambiguous grammars, and provides an FPRAS for grammars with polynomially bounded ambiguity.
The primary use of any probabilistic model involving a set of random variables is to run inference and sampling queries on it. Inference queries in classical probabilistic models are concerned with computing marginal or conditional probabilities of events given as input. When the probabilistic model is sequential, more sophisticated marginal inference queries involving complex grammars may be of interest in fields such as computational linguistics and NLP. In this work, we address the question of computing the likelihood of context-free grammars (CFGs) in Hidden Markov Models (HMMs). We provide a dynamic algorithm for the exact computation of the likelihood for the class of unambiguous context-free grammars. We show that the problem is NP-hard, even with the promise that the input CFG has a degree of ambiguity less than or equal to 2. We then propose a fully polynomial randomized approximation scheme (FPRAS) to approximate the likelihood for the case of polynomially-bounded ambiguous CFGs.
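To make the queried quantity concrete, the following is a minimal brute-force sketch, not the dynamic algorithm or the FPRAS mentioned above. It assumes the likelihood of a CFG is taken over strings of a fixed length n, i.e. the total probability that the HMM's first n emissions form a string in the grammar's language; that reading is an assumption here. The function names, the toy HMM, and the toy grammar for a^k b^k (in Chomsky normal form) are all hypothetical, chosen only for illustration. The enumeration is exponential in n.

```python
from itertools import product

def forward_prob(word, pi, A, B):
    """Probability that the HMM emits `word` as its first len(word) symbols
    (standard forward algorithm). B[s] maps a symbol to its emission probability."""
    states = range(len(pi))
    alpha = [pi[s] * B[s][word[0]] for s in states]
    for sym in word[1:]:
        alpha = [sum(alpha[r] * A[r][s] for r in states) * B[s][sym] for s in states]
    return sum(alpha)

def cyk_accepts(word, unary, binary, start="S"):
    """CYK membership test for a grammar in Chomsky normal form.
    unary: rules (X, terminal); binary: rules (X, Y, Z) meaning X -> Y Z."""
    n = len(word)
    # table[i][l] holds the nonterminals that derive word[i:i+l]
    table = [[set() for _ in range(n + 1)] for _ in range(n)]
    for i, ch in enumerate(word):
        table[i][1] = {X for X, c in unary if c == ch}
    for l in range(2, n + 1):
        for i in range(n - l + 1):
            for k in range(1, l):
                for X, Y, Z in binary:
                    if Y in table[i][k] and Z in table[i + k][l - k]:
                        table[i][l].add(X)
    return start in table[0][n]

def cfg_likelihood(n, alphabet, pi, A, B, unary, binary):
    """Brute force: sum HMM probabilities of all length-n strings the CFG accepts."""
    return sum(
        forward_prob(w, pi, A, B)
        for w in product(alphabet, repeat=n)
        if cyk_accepts(w, unary, binary)
    )

# Toy HMM (assumed for illustration): state 0 always emits 'a', state 1 always emits 'b'.
PI = [1.0, 0.0]
A = [[0.5, 0.5], [0.0, 1.0]]
B = [{"a": 1.0, "b": 0.0}, {"a": 0.0, "b": 1.0}]

# Toy unambiguous CFG for {a^k b^k : k >= 1}: S -> A T | A B, T -> S B, A -> a, B -> b.
UNARY = [("A", "a"), ("B", "b")]
BINARY = [("S", "A", "T"), ("S", "A", "B"), ("T", "S", "B")]

if __name__ == "__main__":
    for n in (2, 3, 4):
        print(n, cfg_likelihood(n, "ab", PI, A, B, UNARY, BINARY))
```

In this toy setting the only accepted string of length 4 is "aabb", so the query reduces to a single forward probability. In general the number of accepted strings can grow exponentially with n, which is why enumeration does not scale.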