LG · AI · RO · Jul 21, 2023

Offline Diversity Maximization Under Imitation Constraints

arXiv:2307.11373v3 · 4 citations · h-index: 10
Originality: Incremental advance
AI Analysis

This work addresses the challenge of leveraging offline data for skill discovery in robotics, offering a practical solution for real-world applications, though it is incremental in combining existing concepts.

The paper tackles two limitations of unsupervised skill discovery: its reliance on online interaction and its lack of a measure of skill utility. It proposes an offline algorithm that maximizes diversity while constraining each skill to imitate expert demonstrations, and it demonstrates the method on D4RL and on a custom quadruped robot dataset, with policies trained in simulation transferring to the real robot.

There has been significant recent progress in the area of unsupervised skill discovery, utilizing various information-theoretic objectives as measures of diversity. Despite these advances, challenges remain: current methods require significant online interaction, fail to leverage vast amounts of available task-agnostic data and typically lack a quantitative measure of skill utility. We address these challenges by proposing a principled offline algorithm for unsupervised skill discovery that, in addition to maximizing diversity, ensures that each learned skill imitates state-only expert demonstrations to a certain degree. Our main analytical contribution is to connect Fenchel duality, reinforcement learning, and unsupervised skill discovery to maximize a mutual information objective subject to KL-divergence state occupancy constraints. Furthermore, we demonstrate the effectiveness of our method on the standard offline benchmark D4RL and on a custom offline dataset collected from a 12-DoF quadruped robot for which the policies trained in simulation transfer well to the real robotic system.
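To make the constrained objective in the abstract concrete, here is a minimal tabular sketch, not the paper's algorithm. It assumes a small discrete state space, treats each skill's state occupancy as a plain probability vector, and assumes the constraint takes the form KL(d_z || d_E) <= epsilon for every skill z; the abstract only says "KL-divergence state occupancy constraints", so the direction of the KL term and the threshold `epsilon` are assumptions. All function names are hypothetical. The sketch evaluates the two quantities being traded off: the mutual information I(S; Z) between states and skills (diversity) and each skill's divergence from the expert occupancy (imitation).

```python
import math


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions given as equal-length lists."""
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0.0:
            continue  # 0 * log(0 / q) is taken to be 0
        if qi == 0.0:
            return math.inf  # p puts mass where q has none
        total += pi * math.log(pi / qi)
    return total


def mutual_information(occupancies, prior):
    """I(S; Z) = sum_z p(z) * KL(d_z || d_mix), with d_mix = sum_z p(z) * d_z."""
    n_states = len(occupancies[0])
    mixture = [
        sum(pz * d[s] for pz, d in zip(prior, occupancies))
        for s in range(n_states)
    ]
    return sum(
        pz * kl_divergence(d, mixture)
        for pz, d in zip(prior, occupancies)
        if pz > 0.0
    )


def satisfies_imitation_constraints(occupancies, expert, epsilon):
    """Assumed constraint form: KL(d_z || d_E) <= epsilon for every skill z."""
    return all(kl_divergence(d, expert) <= epsilon for d in occupancies)


# Two skills over two states, uniform skill prior, uniform expert occupancy.
prior = [0.5, 0.5]
expert = [0.5, 0.5]
disjoint_skills = [[1.0, 0.0], [0.0, 1.0]]   # maximally diverse
identical_skills = [[0.5, 0.5], [0.5, 0.5]]  # copies of the expert

diverse_mi = mutual_information(disjoint_skills, prior)
collapsed_mi = mutual_information(identical_skills, prior)
```

In this toy case the disjoint skills are maximally diverse but each sits at a KL distance of log 2 from the expert, while the identical skills satisfy any non-negative threshold and carry no mutual information. Tightening `epsilon` therefore shrinks the set of feasible skills toward the expert, which is the tension the constrained formulation is meant to balance.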

