RO · AI · LG · Oct 24, 2017

Fast Model Identification via Physics Engines for Data-Efficient Policy Search

arXiv:1710.08893v3 · 21 citations
Originality: Incremental advance
AI Analysis

This work addresses data-efficiency challenges in robotics that matter to both researchers and practitioners, but it is incremental because it adapts existing tools, namely Bayesian optimization and off-the-shelf physics engines.

The paper tackles the problem of identifying mechanical parameters for robots or objects to reduce real-world experiments in model-based reinforcement learning, resulting in a strategy that significantly improves data-efficiency of policy search algorithms.

This paper presents a method for identifying mechanical parameters of robots or objects, such as their mass and friction coefficients. Key features are the use of off-the-shelf physics engines and the adaptation of a Bayesian optimization technique towards minimizing the number of real-world experiments needed for model-based reinforcement learning. The proposed framework reproduces in a physics engine experiments performed on a real robot and optimizes the model's mechanical parameters so as to match real-world trajectories. The optimized model is then used for learning a policy in simulation, before real-world deployment. It is well understood, however, that it is hard to exactly reproduce real trajectories in simulation. Moreover, a near-optimal policy can be frequently found with an imperfect model. Therefore, this work proposes a strategy for identifying a model that is just good enough to approximate the value of a locally optimal policy with a certain confidence, instead of wasting effort on identifying the most accurate model. Evaluations, performed both in simulation and on a real robotic manipulation task, indicate that the proposed strategy results in an overall time-efficient, integrated model identification and learning solution, which significantly improves the data-efficiency of existing policy search algorithms.
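The core idea of matching simulated trajectories to observed ones can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `simulate` function is a toy one-dimensional sliding block standing in for a real physics engine, the "real" trajectory is generated by the same toy model, and a plain grid search replaces the paper's Bayesian optimization and its value-based stopping criterion.

```python
import numpy as np

def simulate(mu, v0=2.0, dt=0.01, steps=100, g=9.81):
    """Toy stand-in for a physics engine (hypothetical): a block sliding
    on a flat surface, decelerated by Coulomb friction with coefficient mu."""
    x, v = 0.0, v0
    traj = []
    for _ in range(steps):
        v = max(v - mu * g * dt, 0.0)  # friction slows the block until it stops
        x += v * dt
        traj.append(x)
    return np.array(traj)

def identify_friction(real_traj, candidates):
    """Return the candidate friction coefficient whose simulated trajectory
    has the smallest mean squared error against the observed one.
    Grid search is used here for brevity, not Bayesian optimization."""
    errors = [np.mean((simulate(mu) - real_traj) ** 2) for mu in candidates]
    best = int(np.argmin(errors))
    return candidates[best], errors[best]

# The "real-world" trajectory is faked with a known friction coefficient.
true_mu = 0.3
real_traj = simulate(true_mu)

candidates = np.linspace(0.05, 0.6, 12)  # candidate values spaced 0.05 apart
mu_hat, err = identify_friction(real_traj, candidates)
print(f"identified mu = {mu_hat:.2f}, trajectory MSE = {err:.2e}")
```

In the setting the abstract describes, each trajectory comparison costs a real-robot experiment, which is why a sample-efficient optimizer and an early stopping rule matter more there than in this toy example.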

