LG | AI | ML | Feb 10, 2020

Explore, Discover and Learn: Unsupervised Discovery of State-Covering Skills

arXiv:2002.03647v4 | 181 citations | Has Code
AI Analysis

The paper addresses a key limitation of unsupervised skill discovery in reinforcement learning: learned options cover the state space poorly. It offers improved state coverage and more flexibility, although the contribution is incremental in that it optimizes the same objective as existing empowerment-based methods.

The paper identifies that existing information-theoretic skill discovery algorithms in reinforcement learning produce options with poor state space coverage, and proposes EDL, an alternative method that overcomes this limitation while optimizing the same objective, showing significant advantages in controlled environments.

Acquiring abilities in the absence of a task-oriented reward function is at the frontier of reinforcement learning research. This problem has been studied through the lens of empowerment, which draws a connection between option discovery and information theory. Information-theoretic skill discovery methods have garnered much interest from the community, but little research has been conducted in understanding their limitations. Through theoretical analysis and empirical evidence, we show that existing algorithms suffer from a common limitation -- they discover options that provide a poor coverage of the state space. In light of this, we propose 'Explore, Discover and Learn' (EDL), an alternative approach to information-theoretic skill discovery. Crucially, EDL optimizes the same information-theoretic objective derived from the empowerment literature, but addresses the optimization problem using different machinery. We perform an extensive evaluation of skill discovery methods on controlled environments and show that EDL offers significant advantages, such as overcoming the coverage problem, reducing the dependence of learned skills on the initial state, and allowing the user to define a prior over which behaviors should be learned. Code is publicly available at https://github.com/victorcampos7/edl.
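As a toy illustration of the information-theoretic objective mentioned in the abstract, the sketch below computes the mutual information I(S;Z) between states S and skills Z for a small discrete joint distribution, using both standard decompositions, H(S) - H(S|Z) and H(Z) - H(Z|S). The joint table, the function names, and the 2-skill, 4-state setup are assumptions made for illustration only; this is not the EDL algorithm or code from the paper's repository.

```python
import numpy as np

def entropy(p):
    """Shannon entropy in nats of a discrete distribution (zeros are skipped)."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return float(-(p * np.log(p)).sum())

def mutual_information(joint):
    """Return I(S;Z) computed two ways for a table joint[z, s] = p(z, s)."""
    joint = np.asarray(joint, dtype=float)
    joint = joint / joint.sum()          # normalize defensively
    p_z = joint.sum(axis=1)              # marginal over skills
    p_s = joint.sum(axis=0)              # marginal over states

    # Form 1: H(S) - H(S|Z)
    h_s_given_z = sum(
        p_z[z] * entropy(joint[z] / p_z[z])
        for z in range(len(p_z)) if p_z[z] > 0
    )
    form_1 = entropy(p_s) - h_s_given_z

    # Form 2: H(Z) - H(Z|S)
    h_z_given_s = sum(
        p_s[s] * entropy(joint[:, s] / p_s[s])
        for s in range(len(p_s)) if p_s[s] > 0
    )
    form_2 = entropy(p_z) - h_z_given_s
    return form_1, form_2

# Hypothetical example: 2 equally likely skills, 4 states.
# Each skill deterministically ends in its own state; states 2 and 3 are never visited.
joint = np.array([
    [0.5, 0.0, 0.0, 0.0],   # skill 0 -> state 0
    [0.0, 0.5, 0.0, 0.0],   # skill 1 -> state 1
])
form_1, form_2 = mutual_information(joint)
states_visited = int((joint.sum(axis=0) > 0).sum())
print(form_1, form_2, states_visited)
```

In this toy case both forms agree and equal H(Z) = log 2, the largest value possible with two equally likely skills, even though only two of the four states are ever visited. It is meant only to show that a high mutual-information score and broad state coverage are different things, not to reproduce the paper's analysis.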

Code Implementations: 1 repo
