RO | AI | LG | Oct 7, 2019

Policies Modulating Trajectory Generators

arXiv:1910.02812v1 | 147 citations
Originality: Incremental advance
AI Analysis

This work addresses efficient and flexible control for robotics tasks with periodic motion, such as locomotion, by combining simple policies with trajectory generators. It is an incremental advance that builds on existing methods such as deep reinforcement learning and evolutionary strategies.

The authors tackle the problem of learning complex controllable behaviors for periodic motion tasks, such as quadrupedal locomotion, by proposing the Policies Modulating Trajectory Generators (PMTG) architecture. They demonstrate that a simple linear policy paired with a parametric trajectory generator can induce walking behaviors with controllable speed from 4-dimensional IMU observations alone, that such a policy can be learned in under 1000 rollouts, and that it transfers successfully to a real robot.

We propose an architecture for learning complex controllable behaviors by having simple Policies Modulate Trajectory Generators (PMTG), a powerful combination that can provide both memory and prior knowledge to the controller. The result is a flexible architecture that is applicable to a class of problems with periodic motion for which one has an insight into the class of trajectories that might lead to a desired behavior. We illustrate the basics of our architecture using a synthetic control problem, then go on to learn speed-controlled locomotion for a quadrupedal robot by using Deep Reinforcement Learning and Evolutionary Strategies. We demonstrate that a simple linear policy, when paired with a parametric Trajectory Generator for quadrupedal gaits, can induce walking behaviors with controllable speed from 4-dimensional IMU observations alone, and can be learned in under 1000 rollouts. We also transfer these policies to a real robot and show locomotion with controllable forward velocity.
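As a rough illustration of the idea described above, the following minimal Python sketch pairs a linear policy with a toy periodic trajectory generator. It is not the authors' implementation. The class `SineTrajectoryGenerator`, the function `pmtg_step`, the sine-wave generator, the nominal frequency and amplitude, and the choice to feed the generator's phase back to the policy as sine and cosine features are all assumptions made for illustration. The sketch only shows how a policy could modulate a generator's parameters and add a correction to its output, while the generator's phase carries state between steps.

```python
import math


class SineTrajectoryGenerator:
    """Toy periodic trajectory generator (hypothetical, not the paper's quadruped TG)."""

    def __init__(self):
        # Internal phase: the state ("memory") the generator carries across steps.
        self.phase = 0.0

    def step(self, frequency, amplitude, dt):
        # Advance the phase and wrap it to [0, 2*pi).
        self.phase = (self.phase + 2.0 * math.pi * frequency * dt) % (2.0 * math.pi)
        return amplitude * math.sin(self.phase)


def linear_policy(weights, features):
    """Plain linear map: one output per weight row."""
    return [sum(w * f for w, f in zip(row, features)) for row in weights]


def pmtg_step(tg, weights, observation, dt=0.01):
    """One control step: the policy modulates the generator and adds a residual."""
    # Assumed feature layout: observation, generator phase as sin/cos, bias term.
    features = list(observation) + [math.sin(tg.phase), math.cos(tg.phase), 1.0]
    d_freq, d_amp, residual = linear_policy(weights, features)
    # Policy outputs shift assumed nominal parameters (1.0 Hz, amplitude 0.5).
    frequency = max(0.0, 1.0 + d_freq)
    amplitude = max(0.0, 0.5 + d_amp)
    u_tg = tg.step(frequency, amplitude, dt)
    # Final action: generator output plus the policy's correction.
    return u_tg + residual


# Usage: a 4-dimensional observation gives 4 + 3 = 7 features and 3 policy outputs.
tg = SineTrajectoryGenerator()
weights = [[0.0] * 7 for _ in range(3)]  # all-zero policy: pure generator output
action = pmtg_step(tg, weights, [0.0, 0.0, 0.0, 0.0])
```

With all-zero weights the action is simply the generator's nominal trajectory, which is one way to read the "prior knowledge" claim: the policy only has to learn deviations from a sensible default gait, which is plausibly why a linear policy can suffice.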

Code Implementations: 3 repos