LG · ML · Oct 1, 2020

Bayesian Policy Search for Stochastic Domains

arXiv:2010.00284v1
Originality: Incremental advance
AI Analysis

This work provides a Bayesian method for policy search in stochastic domains. The advance is incremental because it builds on prior probabilistic programming techniques.

The authors frame policy search in stochastic domains as a Bayesian inference problem, encode it as nested probabilistic programs, and run inference with an adapted Lightweight Metropolis-Hastings (LMH) algorithm. They report that policies of similar quality are learned, despite a simpler and more general inference algorithm.

AI planning can be cast as inference in probabilistic models, and probabilistic programming has been shown to be capable of policy search in partially observable domains. Prior work introduced policy search through Markov chain Monte Carlo in deterministic domains and adapted black-box variational inference to stochastic domains, though not in the strictly Bayesian sense. In this work, we cast policy search in stochastic domains as a Bayesian inference problem and provide a scheme for encoding such problems as nested probabilistic programs. We argue that probabilistic programs for policy search in stochastic domains should involve nested conditioning, and provide an adaptation of Lightweight Metropolis-Hastings (LMH) for robust inference in such programs. We apply the proposed scheme to stochastic domains and show that policies of similar quality are learned, despite a simpler and more general inference algorithm. We believe that the proposed variant of LMH is novel and applicable to a wider class of probabilistic programs with nested conditioning.
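To make the idea of policy search as inference concrete, here is a minimal sketch. It is not the paper's adapted LMH or its nested probabilistic programs. It is a plain single-site Metropolis-Hastings sampler over a tabular policy, with an inner Monte Carlo estimate of the expected return standing in for the nested expectation over the stochastic environment. The toy corridor domain, the slip probability, the uniform prior, the exp(return) likelihood, and all names (`rollout`, `estimate_return`, `policy_search`) are assumptions made for illustration only.

```python
import math
import random
from collections import Counter

# Hypothetical toy domain: a corridor with positions 0..4, goal at position 4.
N_STATES = 5
SLIP = 0.2      # assumed probability that the chosen move is reversed
HORIZON = 50    # maximum episode length


def rollout(policy, rng):
    """Run one episode from position 0; return minus the number of steps taken."""
    pos = 0
    for step in range(HORIZON):
        if pos == N_STATES - 1:
            return -step
        move = policy[pos]
        if rng.random() < SLIP:   # stochastic domain: the move may slip
            move = -move
        pos = min(max(pos + move, 0), N_STATES - 1)
    return -HORIZON


def estimate_return(policy, rng, n_rollouts=20):
    """Inner Monte Carlo estimate of the expected return of a fixed policy."""
    return sum(rollout(policy, rng) for _ in range(n_rollouts)) / n_rollouts


def policy_search(n_iters=2000, seed=0):
    """Single-site MH over policies; target is roughly prior * exp(expected return)."""
    rng = random.Random(seed)
    # Uniform prior: each non-goal state independently picks left (-1) or right (+1).
    policy = [rng.choice((-1, 1)) for _ in range(N_STATES - 1)]
    score = estimate_return(policy, rng)
    samples = Counter()
    for _ in range(n_iters):
        proposal = list(policy)
        site = rng.randrange(len(proposal))
        proposal[site] = rng.choice((-1, 1))   # redraw one choice from its prior
        new_score = estimate_return(proposal, rng)
        # With a prior proposal, the acceptance ratio reduces to the likelihood
        # ratio exp(new_score - score). The scores are noisy estimates, so this
        # is only an approximate sampler.
        if math.log(1.0 - rng.random()) < new_score - score:
            policy, score = proposal, new_score
        samples[tuple(policy)] += 1
    return samples


samples = policy_search()
best_policy, count = samples.most_common(1)[0]
print(best_policy, count / sum(samples.values()))
```

In this toy setting the sampler should concentrate on the policy that always moves toward the goal. The sketch keeps the stored return estimate of the current policy between iterations, a crude choice that a more careful treatment of the stochastic inner computation would replace.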
