From semantics to execution: Integrating action planning with reinforcement learning for robotic causal problem-solving
This work addresses the challenge of combining discrete planning with continuous control in robotics. The contribution is incremental in that it builds on existing methods such as universal value function approximators and hindsight experience replay.
The authors integrate symbolic action planning with reinforcement learning for robotic manipulation and demonstrate that the resulting neuro-symbolic method can solve object manipulation tasks involving tool use and causal dependencies under noisy conditions.
Reinforcement learning is an appropriate and successful method for robust low-level robot control under noisy conditions. Symbolic action planning is useful for resolving causal dependencies and for breaking a causally complex problem down into a sequence of simpler high-level actions. One obstacle to integrating the two approaches is that action planning operates on discrete high-level action and state spaces, whereas reinforcement learning is usually driven by a continuous reward function. However, recent advances in reinforcement learning, specifically universal value function approximators and hindsight experience replay, have focused on goal-independent methods based on sparse rewards. In this article, we build on these novel methods to facilitate the integration of action planning with reinforcement learning, exploiting reward sparsity as a bridge between the high-level and low-level state and control spaces. As a result, we demonstrate that the integrated neuro-symbolic method is able to solve object manipulation problems that involve tool use and non-trivial causal dependencies under noisy conditions, exploiting both data and knowledge.
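The role of reward sparsity as a bridge can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a continuous 3D position as the low-level state, a distance threshold that turns "subgoal reached" into a discrete true/false predicate, a 0/-1 sparse reward, and a simple hindsight relabeling step that substitutes the finally reached position for the goal. The names `EPS`, `achieved`, `sparse_reward`, and `hindsight_relabel` are hypothetical and chosen only for illustration.

```python
from math import dist

# Assumed tolerance (in metres) below which a subgoal counts as reached.
EPS = 0.05


def achieved(position, subgoal, eps=EPS):
    """Discrete predicate over a continuous state: is the subgoal reached?"""
    return dist(position, subgoal) < eps


def sparse_reward(position, subgoal, eps=EPS):
    """Sparse goal-conditioned reward: 0 on success, -1 otherwise (assumed convention)."""
    return 0.0 if achieved(position, subgoal, eps) else -1.0


def hindsight_relabel(trajectory):
    """Relabel a trajectory as if the last reached position had been the goal."""
    new_goal = trajectory[-1]["next_position"]
    return [
        {**step,
         "goal": new_goal,
         "reward": sparse_reward(step["next_position"], new_goal)}
        for step in trajectory
    ]


# A failed attempt: the planner's subgoal was x = 1.0, the gripper stopped at x = 0.3.
planner_subgoal = (1.0, 0.0, 0.0)
reached = [(0.1, 0.0, 0.0), (0.2, 0.0, 0.0), (0.3, 0.0, 0.0)]
trajectory = [
    {"next_position": p,
     "goal": planner_subgoal,
     "reward": sparse_reward(p, planner_subgoal)}
    for p in reached
]
relabeled = hindsight_relabel(trajectory)

original_rewards = [step["reward"] for step in trajectory]
relabeled_rewards = [step["reward"] for step in relabeled]
```

In this sketch the same thresholded predicate serves both sides: a planner could read `achieved` as a symbolic fact about the world, while the learner receives it as a sparse reward. The relabeling step shows how a failed attempt can still yield a successful training sample for a different goal.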