AI, ML · Jun 25, 2024

What type of inference is planning?

Miguel Lázaro-Gredilla, Li Yang Ku, Kevin P. Murphy, Dileep George

arXiv:2406.17863v49.67 citations · Has Code

Originality: Highly original

AI Analysis

This work addresses an inconsistency in how the ML/AI literature defines "planning as inference". It offers a unified variational perspective that separates the type of inference from the approximations used in practice. The contribution is incremental but provides new theoretical insights.

The paper uses a variational framework to show that planning in probabilistic graphical models corresponds to a specific weighting of the entropy terms, which makes the standard tools of variational inference applicable to planning. It develops an approximate planning method for factored-state MDPs and validates it on synthetic MDPs and International Planning Competition tasks. The results show that the inference types previously used for planning are adequate only in low-stochasticity environments.

Multiple types of inference are available for probabilistic graphical models, e.g., marginal, maximum-a-posteriori, and even marginal maximum-a-posteriori. Which one do researchers mean when they talk about "planning as inference"? There is no consistency in the literature: different types are used, and their ability to do planning is further entangled with specific approximations or additional constraints. In this work we use the variational framework to show that, just as all commonly used types of inference correspond to different weightings of the entropy terms in the variational problem, planning corresponds exactly to a different set of weights. This means that all the tricks of variational inference are readily applicable to planning. We develop an analogue of loopy belief propagation that allows us to perform approximate planning in factored-state Markov decision processes without incurring intractability due to the exponentially large state space. The variational perspective shows that the previous types of inference for planning are only adequate in environments with low stochasticity, and allows us to characterize each type on its own merits, disentangling the type of inference from the additional approximations that its practical use requires. We validate these results empirically on synthetic MDPs and tasks posed in the International Planning Competition.
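To make the claim about stochasticity concrete, here is a minimal sketch in Python. It is not the paper's method or code. It assumes a hypothetical one-step decision problem and the common planning-as-inference construction in which a reward r enters the model as a factor exp(r). The problem data and the names `OUTCOMES`, `planning_score`, `mmap_score`, `map_score`, and `best_action` are invented for illustration.

```python
import math

# Hypothetical one-step decision problem (an assumption, not from the paper).
# Each action maps to a list of (probability, reward) outcomes.
OUTCOMES = {
    "safe": [(1.0, 1.0)],               # always yields reward 1
    "risky": [(0.1, 5.0), (0.9, 0.0)],  # rarely yields reward 5, usually 0
}


def planning_score(outcomes):
    """Expected reward of an action: sum_s P(s|a) * r(s)."""
    return sum(p * r for p, r in outcomes)


def mmap_score(outcomes):
    """Marginal MAP over the action: log sum_s P(s|a) * exp(r(s))."""
    return math.log(sum(p * math.exp(r) for p, r in outcomes))


def map_score(outcomes):
    """Joint MAP over action and outcome: max_s [log P(s|a) + r(s)]."""
    return max(math.log(p) + r for p, r in outcomes if p > 0)


def best_action(score, problem=OUTCOMES):
    """Return the action whose outcomes maximize the given score."""
    return max(problem, key=lambda action: score(problem[action]))


if __name__ == "__main__":
    for name, score in [("planning", planning_score),
                        ("marginal MAP", mmap_score),
                        ("MAP", map_score)]:
        print(name, "->", best_action(score))
```

In this toy problem the expected-reward criterion prefers the safe action, while both the MAP and marginal MAP scores prefer the risky one, because they reward the best-case outcome rather than the average. If every action is deterministic, the three scores reduce to the same reward comparison and select the same action, which is consistent with the abstract's statement that these inference types are adequate when stochasticity is low.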
