LG · AI · RO · ML · Apr 13, 2020

Thinking While Moving: Deep Reinforcement Learning with Concurrent Control

arXiv:2004.06089v4 · 44 citations
AI Analysis

This paper addresses a practical challenge for robotics and real-time systems, where decisions must be made while the previous action is still executing. Its approach to handling delays is novel, though the contribution is incremental in that it builds on existing value-based methods.

The paper tackles reinforcement learning in concurrent control settings, where the next action must be sampled while the system is still evolving. It introduces an algorithmic framework based on continuous-time Bellman equations, discretized to account for system delays, and reports improved performance on simulated benchmarks and a large-scale robotic grasping task.

We study reinforcement learning in settings where sampling an action from the policy must be done concurrently with the time evolution of the controlled system, such as when a robot must decide on the next action while still performing the previous action. Much like a person or an animal, the robot must think and move at the same time, deciding on its next action before the previous one has completed. In order to develop an algorithmic framework for such concurrent control problems, we start with a continuous-time formulation of the Bellman equations, and then discretize them in a way that is aware of system delays. We instantiate this new class of approximate dynamic programming methods via a simple architectural extension to existing value-based deep reinforcement learning algorithms. We evaluate our methods on simulated benchmark tasks and a large-scale robotic grasping task where the robot must "think while moving".
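To make the concurrent setting concrete, here is a minimal tabular sketch in Python. It is not the paper's algorithm or architecture. It is a toy illustration under stated assumptions: a hypothetical 1-D track in which the action chosen now only takes effect one step later, and a Q-table that conditions on the action still in progress. The names `step`, `train`, `N_POSITIONS`, `GOAL`, and `ACTIONS` are invented for this example.

```python
import random
from collections import defaultdict

# Hypothetical toy setup, not taken from the paper: a 1-D track where the
# action chosen now takes effect one step later, because the previously
# chosen action is still executing while the agent "thinks".
N_POSITIONS = 7
GOAL = 6
ACTIONS = (-1, +1)


def step(position, executing_action):
    """Advance the system using the action that is currently executing."""
    next_position = min(max(position + executing_action, 0), N_POSITIONS - 1)
    done = next_position == GOAL
    reward = 1.0 if done else 0.0
    return next_position, reward, done


def train(episodes=2000, alpha=0.1, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning on states augmented with the executing action."""
    rng = random.Random(seed)
    # Keyed by (position, executing_action): the action still in progress
    # is part of the state the agent conditions on.
    q = defaultdict(lambda: {a: 0.0 for a in ACTIONS})
    for _ in range(episodes):
        position, executing = 0, rng.choice(ACTIONS)
        for _ in range(50):
            state = (position, executing)
            # Choose the NEXT action while the current one is still running.
            if rng.random() < epsilon:
                chosen = rng.choice(ACTIONS)
            else:
                best = max(q[state].values())
                chosen = rng.choice([a for a in ACTIONS if q[state][a] == best])
            # The system evolves under the executing action, not the new one.
            position, reward, done = step(position, executing)
            next_state = (position, chosen)
            target = reward
            if not done:
                target += gamma * max(q[next_state].values())
            q[state][chosen] += alpha * (target - q[state][chosen])
            executing = chosen  # the chosen action starts executing next step
            if done:
                break
    return q


if __name__ == "__main__":
    q = train()
    # Values for "at position 4, currently moving right".
    print(q[(4, 1)])
```

Because the executing action is part of the state, the one-step delay becomes an ordinary Markov problem that standard value-based updates can handle. The sketch uses a fixed one-step delay and a fixed discount per step; it does not model variable action-selection latency or the continuous-time discounting the abstract alludes to.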

