LG AI · Aug 15, 2013

Hidden Parameter Markov Decision Processes: A Semiparametric Regression Approach for Discovering Latent Task Parametrizations

arXiv:1308.3513v1 · 141 citations

Originality: Incremental advance

AI Analysis

The paper addresses flexible adaptation in control applications where task dynamics vary; the advance appears incremental because it builds on existing MDP frameworks.

The paper tackles control across tasks with similar but non-identical dynamics. It introduces the Hidden Parameter Markov Decision Process (HiP-MDP) framework, which parametrizes related systems with a low-dimensional set of latent factors, and shows that a learned HiP-MDP lets an agent rapidly identify and adapt to the dynamics of a new task.

Control applications often feature tasks with similar, but not identical, dynamics. We introduce the Hidden Parameter Markov Decision Process (HiP-MDP), a framework that parametrizes a family of related dynamical systems with a low-dimensional set of latent factors, and introduce a semiparametric regression approach for learning its structure from data. In the control setting, we show that a learned HiP-MDP rapidly identifies the dynamics of a new task instance, allowing an agent to flexibly adapt to task variations.
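To make the idea of latent task parameters concrete, here is a minimal toy sketch, not the paper's method. It assumes a one-dimensional state, a hand-picked set of basis functions (`basis`, hypothetical), and a per-task latent weight vector `w` that mixes them into the state change. The abstract describes learning this structure with a semiparametric regression approach; the sketch instead fixes the basis and uses ordinary least squares only to show how a few transitions from a new task instance can pin down its latent weights.

```python
import numpy as np

# Hypothetical basis functions shared by all tasks (assumed known here;
# the paper learns the structure from data instead).
def basis(s):
    return np.stack([np.sin(s), np.cos(s), s], axis=-1)

# Toy dynamics: the state change is a task-specific mix of the shared basis.
def step(s, w, rng, noise=0.01):
    return s + basis(s) @ w + noise * rng.standard_normal(np.shape(s))

# Identify the latent weights of a new task from observed transitions
# (plain least squares, swapped in for the paper's regression approach).
def identify_weights(states, next_states):
    phi = basis(states)                # shape (n, 3)
    delta = next_states - states       # shape (n,)
    w_hat, *_ = np.linalg.lstsq(phi, delta, rcond=None)
    return w_hat

rng = np.random.default_rng(0)
w_true = np.array([0.5, -0.2, 0.1])    # latent parameters of the new task
states = rng.uniform(-2.0, 2.0, size=50)
next_states = step(states, w_true, rng)
w_hat = identify_weights(states, next_states)
print(w_hat)
```

In this toy setting, 50 noisy transitions are enough to recover the three weights closely, which mirrors the claim that a low-dimensional parametrization lets an agent identify a new task instance quickly.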
