LG · Mar 13, 2021

Solving Compositional Reinforcement Learning Problems via Task Reduction

Yunfei Li, Yilin Wu, Huazhe Xu, Xiaolong Wang, Yi Wu

arXiv:2103.07607v213.120 citations · Has Code

Originality: Highly original

AI Analysis

This work provides a new learning paradigm for reinforcement learning agents to solve complex compositional tasks more efficiently, particularly in sparse-reward continuous-control environments.

The paper introduces Self-Imitation via Reduction (SIR), a method for compositional reinforcement learning. SIR addresses hard tasks by reducing them to easier, already-solved tasks, then uses the generated solution trajectories for self-imitation, which significantly accelerates and improves learning on sparse-reward continuous-control problems.

We propose a novel learning paradigm, Self-Imitation via Reduction (SIR), for solving compositional reinforcement learning problems. SIR is based on two core ideas: task reduction and self-imitation. Task reduction tackles a hard-to-solve task by actively reducing it to an easier task whose solution is known by the RL agent. Once the original hard task is successfully solved by task reduction, the agent naturally obtains a self-generated solution trajectory to imitate. By continuously collecting and imitating such demonstrations, the agent is able to progressively expand the solved subspace in the entire task space. Experimental results show that SIR can significantly accelerate and improve learning on a variety of challenging sparse-reward continuous-control problems with compositional structures. Code and videos are available at https://sites.google.com/view/sir-compositional.
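The two ideas in the abstract can be illustrated with a minimal toy sketch. It is not the authors' implementation: the 1-D states, the names `within_reach`, `reduce_task`, and `solve_with_reduction`, the product used to score an intermediate target, the fixed candidate list, and the threshold are all assumptions chosen for illustration. Here a task counts as "already solved" when a hypothetical value estimate exceeds the threshold, and a hard task is reduced by picking an intermediate target that makes both halves solvable.

```python
from typing import Callable, List, Optional, Sequence

# Hypothetical value estimate: how confident the agent is that it can
# get from `start` to `goal` directly (1.0 = solvable, 0.0 = not).
ValueFn = Callable[[float, float], float]


def within_reach(start: float, goal: float) -> float:
    """Toy stand-in for a learned value function: short hops are solvable."""
    return 1.0 if abs(goal - start) <= 1.0 else 0.0


def reduce_task(start: float, goal: float, value: ValueFn,
                candidates: Sequence[float],
                threshold: float = 0.5) -> Optional[float]:
    """Return the intermediate target whose two sub-tasks look most solvable.

    The product of the two value estimates is an illustrative scoring
    choice; None is returned if no candidate scores above the threshold.
    """
    best, best_score = None, threshold
    for mid in candidates:
        score = value(start, mid) * value(mid, goal)
        if score > best_score:
            best, best_score = mid, score
    return best


def solve_with_reduction(start: float, goal: float, value: ValueFn,
                         candidates: Sequence[float],
                         demos: List[List[float]],
                         threshold: float = 0.5) -> Optional[List[float]]:
    """Solve directly if possible; otherwise reduce and store the result."""
    if value(start, goal) > threshold:
        return [start, goal]
    mid = reduce_task(start, goal, value, candidates, threshold)
    if mid is None:
        return None
    trajectory = [start, mid, goal]
    # Self-imitation step (sketched): keep the self-generated solution
    # so a policy could later be trained to imitate it.
    demos.append(trajectory)
    return trajectory


demos: List[List[float]] = []
plan = solve_with_reduction(0.0, 1.8, within_reach, [0.5, 0.9, 1.5], demos)
print(plan, len(demos))
```

In this sketch the demonstration buffer only grows when a task that could not be solved directly is solved through a reduction, which mirrors the abstract's description of progressively expanding the solved subspace. A real agent would replace `within_reach` with a learned value function and would train its policy on the stored trajectories.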
