LG | May 4, 2022

State Representation Learning for Goal-Conditioned Reinforcement Learning

arXiv:2205.01965v1 | 13 citations | h-index: 26
Originality: Incremental advance
AI Analysis

This paper provides a reward-free approach for improving goal-conditioned policies in reinforcement learning, though it appears incremental because it builds on existing representation learning methods.

The paper tackles the problem of learning state representations for goal-conditioned reinforcement learning without rewards. It develops a self-supervised method that embeds states so that the distance between two embeddings reflects the minimum number of actions needed to transition between the corresponding states, and it demonstrates the method's effectiveness in classic control and multi-goal environments.

This paper presents a novel state representation for reward-free Markov decision processes. The idea is to learn, in a self-supervised manner, an embedding space where distances between pairs of embedded states correspond to the minimum number of actions needed to transition between them. Compared to previous methods, our approach does not require any domain knowledge, learning from offline and unlabeled data. We show how this representation can be leveraged to learn goal-conditioned policies, providing a notion of similarity between states and goals and a useful heuristic distance to guide planning and reinforcement learning algorithms. Finally, we empirically validate our method in classic control domains and multi-goal environments, demonstrating that our method can successfully learn representations in large and/or continuous domains.
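
To make the target quantity concrete, here is a minimal tabular sketch in Python. It is not the paper's learning procedure: it computes the minimum number of actions between states exactly, by breadth-first search over a small set of offline transitions, and then uses that distance as a heuristic for choosing actions toward a goal. The five-state chain, the action names, the hand-picked embedding `embed`, and the helper names `min_action_distances` and `greedy_action` are all illustrative assumptions; in the paper's setting the embedding is learned and the domains may be large or continuous.

```python
from collections import deque

def min_action_distances(transitions, source):
    """Minimum number of actions from `source` to every reachable state,
    found by breadth-first search over the observed transitions."""
    successors = {}
    for state, _action, next_state in transitions:
        successors.setdefault(state, []).append(next_state)
    dist = {source: 0}
    queue = deque([source])
    while queue:
        state = queue.popleft()
        for nxt in successors.get(state, []):
            if nxt not in dist:
                dist[nxt] = dist[state] + 1
                queue.append(nxt)
    return dist

def greedy_action(transitions, state, goal):
    """Pick the observed action whose successor is closest to `goal`.
    Returns None if no observed action from `state` can reach the goal."""
    best = None
    for s, action, s_next in transitions:
        if s != state:
            continue
        d = min_action_distances(transitions, s_next).get(goal)
        if d is not None and (best is None or d < best[0]):
            best = (d, action)
    return None if best is None else best[1]

# Hypothetical offline, reward-free data: a chain of states 0..4
# with "left" and "right" actions observed in both directions.
transitions = []
for s in range(4):
    transitions.append((s, "right", s + 1))
    transitions.append((s + 1, "left", s))

# Hand-picked 1-D embedding (an assumption, not a learned one) whose
# distances coincide with the minimum action distances on this chain.
def embed(state):
    return float(state)

matches = all(
    abs(embed(s) - embed(g)) == d
    for s in range(5)
    for g, d in min_action_distances(transitions, s).items()
)

print(min_action_distances(transitions, 0))
print(greedy_action(transitions, 1, 4))
print(matches)
```

On this chain the embedding distance matches the breadth-first distance for every pair of states, which is the property the learned representation is meant to approximate when exact search is infeasible.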

