RO | May 7, 2018

Using Simulation to Improve Sample-Efficiency of Bayesian Optimization for Bipedal Robots

arXiv:1805.02732v1 | 31 citations
Originality: Incremental advance
AI Analysis

This work addresses sample-efficiency for roboticists tuning controllers on bipedal robots, but it is incremental as it builds on existing Bayesian optimization methods.

The paper tackles the problem of sample-inefficiency in Bayesian optimization for tuning high-dimensional controllers on bipedal robot hardware by using simulation to transform the parameter space, resulting in more reliable and sample-efficient learning across different robots and simulator fidelities.

Learning for control can acquire controllers for novel robotic tasks, paving the way for autonomous agents. Such controllers can be expert-designed policies, which typically require parameter tuning for each task scenario. In this context, Bayesian optimization (BO) has emerged as a promising approach for automatically tuning controllers. However, when performing BO on hardware for high-dimensional policies, sample-efficiency can be an issue. Here, we develop an approach that uses simulation to map the original parameter space into a domain-informed space. During BO, similarity between controllers is then calculated in this transformed space. Experiments on the ATRIAS robot hardware and on a simulation of another bipedal robot show that our approach succeeds at sample-efficiently learning controllers for multiple robots. Another question arises: what if the simulation differs significantly from hardware? To answer this, we create increasingly approximate simulators and study the effect of increasing simulation-hardware mismatch on the performance of Bayesian optimization. We also compare our approach to other approaches from the literature and find it to be more reliable, especially in cases of high mismatch. Our experiments show that our approach succeeds across different controller types, bipedal robot models, and simulator fidelity levels, making it applicable to a wide range of bipedal locomotion problems.
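The core idea in the abstract, computing similarity between controllers in a simulation-informed space rather than in the raw parameter space, can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: `toy_simulation_features` is a hypothetical stand-in for running a simulator and summarizing the resulting behavior, the squared-exponential kernel and upper-confidence-bound acquisition are common generic choices, and none of this is taken from the paper's actual transform or implementation.

```python
import numpy as np

def toy_simulation_features(params):
    """Hypothetical stand-in for a simulator: maps controller parameters
    to a low-dimensional summary of the resulting behavior."""
    params = np.atleast_2d(params)
    # Assumed toy features: the mean parameter value and a saturating
    # function of the first parameter. A real system would run a simulation.
    return np.column_stack([params.mean(axis=1), np.tanh(params[:, 0])])

def se_kernel(a, b, lengthscale=0.5):
    """Squared-exponential kernel between the rows of a and the rows of b."""
    sq_dists = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * sq_dists / lengthscale**2)

def transformed_kernel(x1, x2, lengthscale=0.5):
    """Similarity between controllers, measured in the transformed space."""
    return se_kernel(toy_simulation_features(x1),
                     toy_simulation_features(x2), lengthscale)

def gp_posterior(x_train, y_train, x_test, noise=1e-3):
    """Zero-mean GP posterior mean and variance using the transformed kernel."""
    k_tt = transformed_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    k_st = transformed_kernel(x_test, x_train)
    mean = k_st @ np.linalg.solve(k_tt, y_train)
    # Prior variance is 1 because the kernel equals 1 at zero distance.
    var = 1.0 - np.einsum("ij,ji->i", k_st, np.linalg.solve(k_tt, k_st.T))
    return mean, np.maximum(var, 0.0)

def next_candidate(x_train, y_train, candidates, beta=2.0):
    """Pick the candidate that maximizes an upper confidence bound."""
    mean, var = gp_posterior(x_train, y_train, candidates)
    return candidates[np.argmax(mean + beta * np.sqrt(var))]
```

In this sketch, two controllers with very different raw parameters are treated as similar whenever the stand-in simulator maps them to similar features, which is the property the abstract relies on for sample-efficiency. How well this works in practice depends on how closely the simulator's features track hardware behavior, the mismatch question the abstract goes on to raise.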
