Lifelong Robotic Reinforcement Learning by Retaining Experiences
This work addresses a practical constraint of physical robotic systems, namely that tasks must be learned sequentially, and enables more efficient skill acquisition without round-robin data collection.
The paper tackles the problem of sequential multi-task reinforcement learning for robots, where tasks arrive one after another, and presents an approach that reuses data and policies from previous tasks to learn new ones more efficiently. In simulated robotic manipulation experiments, the method requires fewer than half the samples needed to learn each task from scratch. On a real robot arm, it successfully learns ten challenging tasks.
Multi-task learning ideally allows robots to acquire a diverse repertoire of useful skills. However, many multi-task reinforcement learning efforts assume the robot can collect data from all tasks at all times. In reality, the tasks that the robot learns arrive sequentially, depending on the user and the robot's current environment. In this work, we study a sequential multi-task RL problem that is motivated by the practical constraints of physical robotic systems, and derive an approach that effectively leverages the data and policies learned for previous tasks to cumulatively grow the robot's skill-set. In a series of simulated robotic manipulation experiments, our approach requires fewer than half the samples needed to learn each task from scratch, while avoiding impractical round-robin data collection. On a Franka Emika Panda robot arm, our approach incrementally learns ten challenging tasks, including bottle capping and block insertion.
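The abstract does not spell out how earlier data is reused, so the following is only a minimal sketch of one generic way to retain experience across sequentially arriving tasks. It is not the authors' algorithm. The class `RetainedReplayBuffer`, the `prior_fraction` parameter, and the task names are hypothetical, introduced purely for illustration.

```python
import random
from collections import defaultdict


class RetainedReplayBuffer:
    """Hypothetical buffer that keeps transitions from every task seen so far."""

    def __init__(self, seed=0):
        self._by_task = defaultdict(list)  # task_id -> list of transitions
        self._rng = random.Random(seed)

    def add(self, task_id, transition):
        """Store a transition under the task that produced it."""
        self._by_task[task_id].append(transition)

    def sample(self, current_task, batch_size, prior_fraction=0.5):
        """Draw a batch that mixes current-task data with retained prior-task data.

        prior_fraction is an assumed knob, not a value taken from the paper.
        """
        prior = [t for tid, ts in self._by_task.items()
                 if tid != current_task for t in ts]
        current = self._by_task.get(current_task, [])
        # Use prior data only if some exists; fall back to whichever pool is non-empty.
        n_prior = int(batch_size * prior_fraction) if prior else 0
        if not current:
            n_prior = batch_size if prior else 0
        n_current = batch_size - n_prior if current else 0
        batch = [self._rng.choice(prior) for _ in range(n_prior)]
        batch += [self._rng.choice(current) for _ in range(n_current)]
        return batch


# Usage: task_0 was learned earlier; task_1 has just arrived and has little data.
buffer = RetainedReplayBuffer(seed=0)
for step in range(100):
    buffer.add("task_0", ("task_0", step))
for step in range(10):
    buffer.add("task_1", ("task_1", step))
batch = buffer.sample(current_task="task_1", batch_size=8, prior_fraction=0.5)
```

Because old transitions are simply kept rather than re-collected, the robot never has to revisit earlier tasks, which is the round-robin collection the abstract describes as impractical. How retained data should be weighted or filtered for a new task is left open in this sketch.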