LG MA ML, Apr 17, 2019

PLOTS: Procedure Learning from Observations using Subtask Structure

arXiv:1904.09162v1, 1.83 citations

Originality: Incremental advance

AI Analysis

This work addresses procedural learning from observation for intelligent agents, which could let them make better use of video data. The contribution is incremental: it builds on existing methods with specific optimizations.

The paper tackles the problem of learning procedures from a single observed trajectory, enabling agents to mimic demonstrated sequences efficiently. The method learns about 100 times faster than policy-gradient approaches, and optimistic action selection yields further speedups when latent dynamical structure is present.

In many cases an intelligent agent may want to learn how to mimic a single observed demonstrated trajectory. In this work we consider how to perform such procedural learning from observation, which could help agents make better use of the enormous amount of video data on observation sequences. Our approach exploits the properties of this setting to incrementally build an open-loop action plan that can yield the desired subsequence, and it can be used in both Markov and partially observable Markov domains. In addition, procedures commonly involve repeated extended temporal action subsequences. Our method optimistically explores actions to leverage potential repeated structure in the procedure. Comparing against some state-of-the-art approaches, we find that our explicit procedural learning from observation method is about 100 times faster than policy-gradient-based approaches that learn a stochastic policy, and is faster than model-based approaches as well. We also find that performing optimistic action selection yields substantial speedups when latent dynamical structure is present.
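The idea of incrementally building an open-loop plan can be illustrated with a minimal sketch. This is not the paper's algorithm: it assumes a deterministic, resettable environment, and the names `learn_open_loop_plan`, `guess_next`, and `line_step`, as well as the "repeat what followed this action last time" heuristic, are hypothetical stand-ins for the optimistic exploration described above.

```python
from typing import Callable, Hashable, List, Optional, Sequence


def guess_next(plan: List[int]) -> Optional[int]:
    """Hypothetical optimistic guess: if the last action appeared earlier
    in the plan, propose the action that followed it most recently."""
    if len(plan) < 2:
        return None
    last = plan[-1]
    for i in range(len(plan) - 2, -1, -1):
        if plan[i] == last:
            return plan[i + 1]
    return None


def learn_open_loop_plan(
    step: Callable[[Hashable, int], Hashable],
    start_state: Hashable,
    demo_obs: Sequence[Hashable],
    n_actions: int,
    observe: Callable[[Hashable], Hashable] = lambda s: s,
) -> List[int]:
    """Extend the plan one action at a time so that replaying it from the
    start reproduces the demonstrated observations (assumes determinism)."""
    plan: List[int] = []
    for t in range(1, len(demo_obs)):
        candidates = list(range(n_actions))
        guess = guess_next(plan)
        if guess is not None:
            # Try the optimistic guess before the remaining actions.
            candidates.remove(guess)
            candidates.insert(0, guess)
        for action in candidates:
            # Reset, replay the plan so far open loop, then try the candidate.
            state = start_state
            for past_action in plan:
                state = step(state, past_action)
            state = step(state, action)
            if observe(state) == demo_obs[t]:
                plan.append(action)
                break
        else:
            raise ValueError(f"no action reproduces observation {t}")
    return plan


# Toy environment (an assumption for illustration): a walker on a line.
# Action 0 moves left, 1 moves right, 2 stays in place.
def line_step(position: int, action: int) -> int:
    return position + (-1, 1, 0)[action]


demo = [0, 1, 2, 1, 2, 3]
plan = learn_open_loop_plan(line_step, 0, demo, n_actions=3)
```

Because the plan is replayed from the start state without consulting intermediate observations, the same loop applies when `observe` hides part of the state, although matching a single observation per step is then a weaker check than the sketch suggests.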
