AI | Jun 13, 2012

Hierarchical POMDP Controller Optimization by Likelihood Maximization

Marc Toussaint, Laurent Charlin, Pascal Poupart

arXiv:1206.3291v1 | 91 citations

Originality: Incremental advance

AI Analysis

This work addresses scalability issues in hierarchical planning for partially observable environments, offering an incremental improvement over existing methods.

The paper tackles the hierarchy discovery problem in partially observable domains by transforming it into a dynamic Bayesian network and optimizing the policy by maximum likelihood. Its experiments indicate that this method scales better than previous techniques based on non-convex optimization.

Planning can often be simplified by decomposing the task into smaller tasks arranged hierarchically. Charlin et al. [4] recently showed that the hierarchy discovery problem can be framed as a non-convex optimization problem. However, the inherent computational difficulty of solving such an optimization problem makes it hard to scale to real-world problems. In another line of research, Toussaint et al. [18] developed a method to solve planning problems by maximum-likelihood estimation. In this paper, we show how the hierarchy discovery problem in partially observable domains can be tackled using a similar maximum likelihood approach. Our technique first transforms the problem into a dynamic Bayesian network through which a hierarchical structure can naturally be discovered while optimizing the policy. Experimental results demonstrate that this approach scales better than previous techniques based on non-convex optimization.
