Self-Improving Skill Learning for Robust Skill-based Meta-Reinforcement Learning
The work addresses robust adaptation in long-horizon environments, a practical concern for meta-RL practitioners, though the contribution appears incremental because it builds on existing skill-based approaches.
The paper tackles the instability of skill-based meta-reinforcement learning under noisy offline demonstrations. Its proposed method, Self-Improving Skill Learning (SISL), is reported to achieve reliable skill learning and to consistently outperform other skill-based meta-RL methods on diverse long-horizon tasks.
Meta-reinforcement learning (Meta-RL) facilitates rapid adaptation to unseen tasks but faces challenges in long-horizon environments. Skill-based approaches tackle this by decomposing state-action sequences into reusable skills and employing hierarchical decision-making. However, these methods are highly susceptible to noisy offline demonstrations, leading to unstable skill learning and degraded performance. To address this, we propose Self-Improving Skill Learning (SISL), which performs self-guided skill refinement using decoupled high-level and skill improvement policies, while applying skill prioritization via maximum return relabeling to focus updates on task-relevant trajectories, resulting in robust and stable adaptation even under noisy and suboptimal data. By mitigating the effect of noise, SISL achieves reliable skill learning and consistently outperforms other skill-based meta-RL methods on diverse long-horizon tasks.
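The abstract does not spell out how maximum return relabeling works, so the following is only a minimal sketch of one plausible reading: each task's data is labeled with the best return observed so far, and tasks are then sampled with a softmax over those labels so that updates concentrate on higher-return trajectories. The function names (`relabel_with_max_return`, `prioritization_weights`, `sample_tasks`), the softmax form, and the `temperature` parameter are all assumptions made for illustration, not details taken from the paper.

```python
import math
import random
from typing import Dict, List, Sequence


def relabel_with_max_return(
    returns_by_task: Dict[str, Sequence[float]]
) -> Dict[str, float]:
    """Label each task with the highest return among its stored trajectories.

    Tasks with no trajectories are skipped. (Hypothetical helper.)
    """
    return {
        task: max(returns)
        for task, returns in returns_by_task.items()
        if len(returns) > 0
    }


def prioritization_weights(
    max_returns: Dict[str, float], temperature: float = 1.0
) -> Dict[str, float]:
    """Turn max-return labels into sampling weights with a softmax.

    The softmax and the temperature are assumptions of this sketch.
    """
    if not max_returns:
        raise ValueError("max_returns must not be empty")
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    # Subtract the peak before exponentiating for numerical stability.
    peak = max(max_returns.values())
    exps = {
        task: math.exp((ret - peak) / temperature)
        for task, ret in max_returns.items()
    }
    total = sum(exps.values())
    return {task: value / total for task, value in exps.items()}


def sample_tasks(weights: Dict[str, float], k: int, seed: int = 0) -> List[str]:
    """Draw k task identifiers in proportion to their prioritization weights."""
    rng = random.Random(seed)
    tasks = list(weights)
    return rng.choices(tasks, weights=[weights[t] for t in tasks], k=k)
```

In this sketch a lower `temperature` concentrates sampling on the highest-return tasks, while a higher one moves toward uniform sampling; which trade-off SISL actually makes is not stated in the abstract.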