LGAIIT | Jun 12, 2025

Task Adaptation from Skills: Information Geometry, Disentanglement, and New Objectives for Unsupervised Reinforcement Learning

arXiv:2506.10629v1 | 13 citations | h-index: 49 | ICLR
Originality: Incremental advance
AI Analysis

This work addresses task adaptation in unsupervised reinforcement learning: how well skills learned without rewards can initialize policies for downstream tasks. It is assessed as an incremental advance.

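The paper analyzes how well skills learned through unsupervised reinforcement learning (URL) adapt to downstream tasks. It shows that mutual information skill learning (MISL) does not guarantee the diversity and separability of learned skills, even though both properties are critical to adaptation. To address this, it proposes a disentanglement metric, LSEPIN, and two Wasserstein distance-based methods: the objective WSEP, which can discover more initial policies than MISL, and the algorithm PWSEP, which can theoretically discover all optimal initial policies.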

Unsupervised reinforcement learning (URL) aims to learn general skills for unseen downstream tasks. Mutual Information Skill Learning (MISL) addresses URL by maximizing the mutual information between states and skills but lacks sufficient theoretical analysis, e.g., how well its learned skills can initialize a downstream task's policy. Our new theoretical analysis in this paper shows that the diversity and separability of learned skills are fundamentally critical to downstream task adaptation but MISL does not necessarily guarantee these properties. To complement MISL, we propose a novel disentanglement metric LSEPIN. Moreover, we build an information-geometric connection between LSEPIN and downstream task adaptation cost. For better geometric properties, we investigate a new strategy that replaces the KL divergence in information geometry with Wasserstein distance. We extend the geometric analysis to it, which leads to a novel skill-learning objective WSEP. It is theoretically justified to be helpful to downstream task adaptation and it is capable of discovering more initial policies for downstream tasks than MISL. We finally propose another Wasserstein distance-based algorithm PWSEP that can theoretically discover all optimal initial policies.
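To make the contrast between KL divergence and Wasserstein distance concrete, here is a minimal sketch on a toy problem. It is not the paper's LSEPIN, WSEP, or PWSEP; the four-state line, the two deterministic skills, and all function names are assumptions chosen for illustration. It computes the mutual information I(S; Z) between states and skills, which is the quantity MISL maximizes, and then compares the state distributions of two skills under KL divergence and under the 1-D Wasserstein-1 distance.

```python
import math

def mutual_information(joint):
    """I(S; Z) in nats for a joint table joint[s][z] = p(s, z)."""
    p_s = [sum(row) for row in joint]        # marginal over states
    p_z = [sum(col) for col in zip(*joint)]  # marginal over skills
    mi = 0.0
    for s, row in enumerate(joint):
        for z, p in enumerate(row):
            if p > 0.0:
                mi += p * math.log(p / (p_s[s] * p_z[z]))
    return mi

def kl_divergence(p, q):
    """KL(p || q) in nats; infinite when p has mass where q has none."""
    total = 0.0
    for pi, qi in zip(p, q):
        if pi > 0.0:
            if qi == 0.0:
                return math.inf
            total += pi * math.log(pi / qi)
    return total

def wasserstein_1d(p, q):
    """Wasserstein-1 distance on states 0..n-1 with unit spacing,
    computed as the summed absolute difference of the two CDFs."""
    cdf_gap = 0.0
    total = 0.0
    for pi, qi in zip(p, q):
        cdf_gap += pi - qi
        total += abs(cdf_gap)
    return total

# Hypothetical setup: two equally likely skills on a line of four states.
# "near": skill 0 stays in state 0, skill 1 stays in state 1.
# "far":  skill 0 stays in state 0, skill 1 stays in state 3.
near = [[0.5, 0.0], [0.0, 0.5], [0.0, 0.0], [0.0, 0.0]]
far = [[0.5, 0.0], [0.0, 0.0], [0.0, 0.0], [0.0, 0.5]]

skill0 = [1.0, 0.0, 0.0, 0.0]
skill1_near = [0.0, 1.0, 0.0, 0.0]
skill1_far = [0.0, 0.0, 0.0, 1.0]

mi_near = mutual_information(near)
mi_far = mutual_information(far)
kl_near = kl_divergence(skill0, skill1_near)
kl_far = kl_divergence(skill0, skill1_far)
w_near = wasserstein_1d(skill0, skill1_near)
w_far = wasserstein_1d(skill0, skill1_far)

print(mi_near, mi_far)  # identical: both equal log 2
print(kl_near, kl_far)  # identical: both infinite
print(w_near, w_far)    # different: 1.0 versus 3.0
```

In this toy case, mutual information and KL divergence cannot tell the two skill sets apart, because both depend only on whether the state distributions overlap. The Wasserstein distance also reflects how far apart the visited states are. That is one generic sense in which a Wasserstein-based geometry can have "better geometric properties", though the paper's own argument may rest on different considerations.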

