Curious Meta-Controller: Adaptive Alternation between Model-Based and Model-Free Control in Deep Reinforcement Learning
This work addresses the high sample requirements of model-free RL for robotics with a hybrid approach that is incremental but practical for real-world applications.
The paper tackles the trade-off in deep reinforcement learning for continuous control between the high sample requirements of model-free methods and the representational limitations of model-based methods. It proposes the Curious Meta-Controller, which adaptively alternates between model-based and model-free control using curiosity feedback, and reports near-optimal performance and improved sample efficiency on robotic reaching and grasping tasks from raw-pixel input.
Recent success in deep reinforcement learning for continuous control has been dominated by model-free approaches, which, unlike model-based approaches, do not suffer from the representational limitations of making assumptions about the world dynamics, nor from the model errors that are inevitable in complex domains. However, they require far more experience than model-based approaches, which are typically more sample-efficient. We propose to combine the benefits of the two approaches in an integrated approach called Curious Meta-Controller. Our approach alternates adaptively between model-based and model-free control using curiosity feedback based on the learning progress of a neural model of the dynamics in a learned latent space. We demonstrate that our approach can significantly improve sample efficiency and achieve near-optimal performance on learning robotic reaching and grasping tasks from raw-pixel input in both dense and sparse reward settings.
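The alternation rule described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that learning progress is measured as the drop in the dynamics model's mean prediction error between the older and newer halves of a sliding window, and that positive progress selects the model-based controller. The names `LearningProgressMeter` and `choose_controller`, the window size, the threshold, and the direction of the switch are all hypothetical choices made for illustration.

```python
from collections import deque


class LearningProgressMeter:
    """Tracks dynamics-model prediction errors and estimates learning progress.

    Assumption: progress is the mean error of the older half of a sliding
    window minus the mean error of the newer half (positive = improving).
    """

    def __init__(self, window=20):
        if window < 2 or window % 2:
            raise ValueError("window must be an even integer >= 2")
        self.errors = deque(maxlen=window)

    def update(self, prediction_error):
        # Record the latest latent-space prediction error of the model.
        self.errors.append(float(prediction_error))

    def progress(self):
        # Report no progress until the window is full.
        if len(self.errors) < self.errors.maxlen:
            return 0.0
        half = self.errors.maxlen // 2
        errs = list(self.errors)
        older = sum(errs[:half]) / half
        newer = sum(errs[half:]) / half
        return older - newer


def choose_controller(progress, threshold=0.0):
    """Pick a controller from the curiosity signal.

    Assumption: progress above the threshold selects the model-based
    controller; otherwise the model-free policy acts.
    """
    return "model_based" if progress > threshold else "model_free"


# Hypothetical usage: errors fall from 4.0 to 1.0 over a window of four steps.
meter = LearningProgressMeter(window=4)
for err in [4.0, 3.0, 2.0, 1.0]:
    meter.update(err)
print(choose_controller(meter.progress()))
```

In this sketch the agent falls back to the model-free policy whenever the window is not yet full or the model's error has stopped decreasing. A real system would compute the prediction error from the learned latent dynamics model at every environment step.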