RO · LG · Oct 21, 2019

Modelling Generalized Forces with Reinforcement Learning for Sim-to-Real Transfer

arXiv:1910.09471v1 · 23 citations
Originality: Incremental advance
AI Analysis

This work addresses sim-to-real transfer for robotic control, a prerequisite for deploying robots in real-world applications. Its contribution is incremental: it builds on existing reinforcement learning and system identification methods.

The paper tackles inaccurate simulation models for robotic control. It proposes a framework that uses reinforcement learning to optimize state-dependent generalized forces, which improves sim-to-real policy transfer with only minutes of real-world data. The approach is demonstrated on a Sawyer robot in a nonprehensile manipulation task.

Learning robotic control policies in the real world gives rise to challenges in data efficiency, safety, and controlling the initial condition of the system. On the other hand, simulations are a useful alternative as they provide an abundant source of data without the restrictions of the real world. Unfortunately, simulations often fail to accurately model complex real-world phenomena. Traditional system identification techniques are limited in expressiveness by the analytical model parameters, and usually are not sufficient to capture such phenomena. In this paper, we propose a general framework for improving the analytical model by optimizing state-dependent generalized forces. State-dependent generalized forces are expressive enough to model constraints in the equations of motion, while maintaining a clear physical meaning and intuition. We use reinforcement learning to efficiently optimize the mapping from states to generalized forces over a discounted infinite horizon. We show that using only minutes of real-world data improves the sim-to-real control policy transfer. We demonstrate the feasibility of our approach by validating it on a nonprehensile manipulation task on the Sawyer robot.
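
To make the idea in the abstract concrete, here is a minimal toy sketch. It is not the paper's implementation. It assumes a 1-DoF point mass whose "real" dynamics include viscous friction that the analytical model omits, a hypothetical linear map from state to generalized force, and a simple random-search hill climber as a stand-in for the reinforcement learning optimizer. The infinite-horizon discounted objective is approximated by a discounted sum over a finite rollout. All names and constants (DT, MASS, the friction coefficient, generalized_force, fit) are illustrative assumptions.

```python
import random

DT = 0.01    # integration step in seconds (assumed)
MASS = 1.0   # mass used by the analytical model in kg (assumed)


def step(q, qd, tau, extra_force=0.0):
    """One semi-implicit Euler step of m * qdd = tau + extra_force."""
    qdd = (tau + extra_force) / MASS
    qd = qd + DT * qdd
    q = q + DT * qd
    return q, qd


def real_rollout(taus):
    """Stand-in for the real robot: the model plus unmodelled viscous friction."""
    q, qd, traj = 0.0, 0.0, []
    for tau in taus:
        q, qd = step(q, qd, tau, extra_force=-0.8 * qd)  # assumed friction
        traj.append((q, qd))
    return traj


def generalized_force(w, q, qd):
    """Hypothetical linear state-to-force map f(x) = w0*q + w1*qd + w2."""
    return w[0] * q + w[1] * qd + w[2]


def sim_rollout(taus, w):
    """Simulate the analytical model augmented with the state-dependent force."""
    q, qd, traj = 0.0, 0.0, []
    for tau in taus:
        q, qd = step(q, qd, tau, extra_force=generalized_force(w, q, qd))
        traj.append((q, qd))
    return traj


def discounted_cost(w, taus, real_traj, gamma=0.99):
    """Discounted squared state mismatch between simulated and real rollouts."""
    cost, discount = 0.0, 1.0
    for (qs, qds), (qr, qdr) in zip(sim_rollout(taus, w), real_traj):
        cost += discount * ((qs - qr) ** 2 + (qds - qdr) ** 2)
        discount *= gamma
    return cost


def fit(taus, real_traj, iters=500, sigma=0.1, seed=0):
    """Random-search hill climbing; a simple stand-in for the RL optimizer."""
    rng = random.Random(seed)
    w = [0.0, 0.0, 0.0]
    best = discounted_cost(w, taus, real_traj)
    for _ in range(iters):
        cand = [wi + rng.gauss(0.0, sigma) for wi in w]
        cost = discounted_cost(cand, taus, real_traj)
        if cost < best:  # keep a candidate only if it reduces the mismatch
            w, best = cand, cost
    return w, best


# Two seconds of "real" data: push forward, then push back.
taus = [1.0 if k < 100 else -1.0 for k in range(200)]
real = real_rollout(taus)
baseline_cost = discounted_cost([0.0, 0.0, 0.0], taus, real)
w, learned_cost = fit(taus, real)
print("baseline cost:", baseline_cost)
print("learned cost:", learned_cost, "weights:", w)
```

In this toy setting the fitted force plays the role of the missing friction term, so the augmented simulator tracks the recorded trajectory more closely than the unmodified model. The paper's setting differs in scale and method: the forces are produced by a policy trained with reinforcement learning on a simulated Sawyer robot.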

