Tactile Active Inference Reinforcement Learning for Efficient Robotic Manipulation Skill Acquisition
This work addresses the challenge of applying robotic manipulation in diverse real-world scenarios by improving training efficiency, though it appears incremental as it builds on existing reinforcement learning and active inference methods.
The paper tackles the problem of inefficient training in robotic manipulation by proposing Tactile Active Inference Reinforcement Learning (Tactile-AIRL), which achieves high training efficiency in simulation for non-prehensile pushing tasks, surpassing the SAC baseline with few interaction episodes, and demonstrates rapid learning in physical gripper screwing experiments.
Robotic manipulation holds the potential to replace humans in the execution of tedious or dangerous tasks. However, control-based approaches are ill-suited because open-world manipulation is difficult to describe formally, and existing learning methods are inefficient. Applying manipulation across a wide range of scenarios therefore remains a significant challenge. In this study, we propose a novel method for skill learning in robotic manipulation called Tactile Active Inference Reinforcement Learning (Tactile-AIRL), aimed at achieving efficient training. To enhance the performance of reinforcement learning (RL), we introduce active inference, which integrates model-based techniques and intrinsic curiosity into the RL process. This integration improves the algorithm's training efficiency and its adaptability to sparse rewards. Additionally, we use a vision-based tactile sensor to provide detailed perception for manipulation tasks. Finally, we employ a model-based approach to imagine and plan appropriate actions through free energy minimization. Simulation results demonstrate that our method achieves high training efficiency in non-prehensile object-pushing tasks. It enables agents to excel in both dense and sparse reward tasks with just a few interaction episodes, surpassing the Soft Actor-Critic (SAC) baseline. Furthermore, we conduct physical experiments on a gripper screwing task using our method, which showcase the algorithm's rapid learning capability and its potential for practical applications.
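To make the idea of planning through free energy minimization more concrete, the following is a minimal sketch and not the implementation used in this work. It assumes a toy ensemble of linear dynamics models, a squared distance to a goal state as the extrinsic term, and ensemble disagreement as a stand-in for intrinsic curiosity. The names `expected_free_energy`, `plan`, and `beta`, as well as the random-shooting planner, are hypothetical choices made for illustration only.

```python
import numpy as np

def expected_free_energy(models, state, actions, goal, beta=1.0):
    """Score one candidate action sequence; lower is better.

    models  : list of (A, B) pairs, a toy ensemble with s' = A @ s + B @ a (assumption)
    actions : array of shape (horizon, action_dim)
    beta    : weight of the curiosity term (hypothetical parameter)
    """
    # Start one imagined rollout per ensemble member from the same state.
    states = np.repeat(np.asarray(state, dtype=float)[None, :], len(models), axis=0)
    total = 0.0
    for a in np.asarray(actions, dtype=float):
        states = np.stack([A @ s + B @ a for (A, B), s in zip(models, states)])
        mean_state = states.mean(axis=0)
        extrinsic = np.sum((mean_state - goal) ** 2)  # distance to the preferred state
        epistemic = states.var(axis=0).sum()          # ensemble disagreement
        total += extrinsic - beta * epistemic         # curiosity lowers the score
    return float(total)

def plan(models, state, goal, horizon=5, n_candidates=256, beta=1.0, rng=None):
    """Random-shooting planner: sample action sequences, keep the lowest score."""
    rng = np.random.default_rng(0) if rng is None else rng
    action_dim = models[0][1].shape[1]
    candidates = rng.uniform(-1.0, 1.0, size=(n_candidates, horizon, action_dim))
    scores = np.array(
        [expected_free_energy(models, state, c, goal, beta) for c in candidates]
    )
    best = int(np.argmin(scores))
    return candidates[best][0], float(scores[best])  # execute only the first action
```

In this sketch, only the first action of the best sequence is executed before replanning, in the style of model predictive control. Setting `beta` to zero removes the curiosity term and leaves a purely goal-driven planner, which is one way to see why the intrinsic term may help when rewards are sparse.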