CasIL: Cognizing and Imitating Skills via a Dual Cognition-Action Architecture
This paper addresses sub-optimal imitation learning in complex robotic tasks and is aimed at robotics researchers, though the contribution appears incremental because it builds on existing imitation learning approaches.
The paper tackles the challenge of enabling robots to imitate expert skills effectively in long-horizon tasks by proposing CasIL, a dual cognition-action architecture that incorporates human cognitive priors; it reports competitive and robust performance on benchmarks such as MuJoCo and RLBench.
Enabling robots to effectively imitate expert skills in long-horizon tasks such as locomotion and manipulation poses a long-standing challenge. Existing imitation learning (IL) approaches for robots still grapple with sub-optimal performance in complex tasks. In this paper, we consider how this challenge can be addressed with human cognitive priors. Heuristically, we extend the usual notion of action to a dual Cognition (high-level)-Action (low-level) architecture by introducing intuitive human cognitive priors, and propose a novel skill IL framework through human-robot interaction, called Cognition-Action-based Skill Imitation Learning (CasIL), for the robotic agent to effectively cognize and imitate the critical skills from raw visual demonstrations. CasIL enables both cognition and action imitation, with high-level skill cognition explicitly guiding low-level primitive actions, providing robustness and reliability to the entire skill IL process. We evaluate our method on the MuJoCo and RLBench benchmarks, as well as on obstacle avoidance and point-goal navigation tasks for quadrupedal robot locomotion. Experimental results show that CasIL consistently achieves competitive and robust skill imitation capability compared to other counterparts in a variety of long-horizon robotic tasks.
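The dual structure described above, in which a high-level cognition step guides a low-level action step and both levels are imitated, can be illustrated with a minimal sketch. This is not the CasIL implementation: the names `DualPolicy`, `imitation_loss`, `toy_cognize`, and `toy_act`, the integer skill labels, the 0/1 skill penalty, and the squared action error are all assumptions made for illustration, and the toy task uses a two-number observation rather than raw visual input.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

Observation = Sequence[float]
Action = Tuple[float, ...]
# One demonstration step: (observation, expert skill label, expert action).
DemoStep = Tuple[Observation, int, Action]


@dataclass
class DualPolicy:
    """Hypothetical two-level policy: cognition picks a skill, action follows it."""
    cognize: Callable[[Observation], int]      # high-level: observation -> skill id
    act: Callable[[Observation, int], Action]  # low-level: (observation, skill) -> action

    def step(self, obs: Observation) -> Tuple[int, Action]:
        skill = self.cognize(obs)       # high-level skill cognition
        action = self.act(obs, skill)   # low-level action conditioned on the skill
        return skill, action


def imitation_loss(policy: DualPolicy, demo: List[DemoStep],
                   skill_weight: float = 1.0) -> float:
    """Mean per-step loss: 0/1 skill mismatch plus squared action error (assumed form)."""
    total = 0.0
    for obs, expert_skill, expert_action in demo:
        skill, action = policy.step(obs)
        skill_err = 0.0 if skill == expert_skill else 1.0
        action_err = sum((a - e) ** 2 for a, e in zip(action, expert_action))
        total += skill_weight * skill_err + action_err
    return total / len(demo)


# Toy 1-D reaching task (assumption): observation is (position, goal).
# Skill 1 means "move right", skill 0 means "move left".
def toy_cognize(obs: Observation) -> int:
    position, goal = obs
    return 1 if goal > position else 0


def toy_act(obs: Observation, skill: int) -> Action:
    return (0.1,) if skill == 1 else (-0.1,)


policy = DualPolicy(cognize=toy_cognize, act=toy_act)
demo = [((0.0, 1.0), 1, (0.1,)), ((2.0, 1.0), 0, (-0.1,))]
print(imitation_loss(policy, demo))
```

Scoring the skill choice and the action separately is one simple way to express "both cognition and action imitation" in a single objective; a learned system would replace the hand-written functions with trained models and a differentiable loss.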