LG | AI | ML | Feb 13, 2024

A Distributional Analogue to the Successor Representation

arXiv:2402.08530v2 | 13 citations | h-index: 43 | ICML
Originality: Highly original
AI Analysis

This work introduces a method for distributional reinforcement learning that supports risk-sensitive policy evaluation.

The paper introduces the distributional successor measure, which separates transition structure from reward in distributional reinforcement learning and enables zero-shot risk-sensitive policy evaluation.

This paper contributes a new approach for distributional reinforcement learning which elucidates a clean separation of transition structure and reward in the learning process. Analogous to how the successor representation (SR) describes the expected consequences of behaving according to a given policy, our distributional successor measure (SM) describes the distributional consequences of this behaviour. We formulate the distributional SM as a distribution over distributions and provide theory connecting it with distributional and model-based reinforcement learning. Moreover, we propose an algorithm that learns the distributional SM from data by minimizing a two-level maximum mean discrepancy. Key to our method are a number of algorithmic techniques that are independently valuable for learning generative models of state. As an illustration of the usefulness of the distributional SM, we show that it enables zero-shot risk-sensitive policy evaluation in a way that was not previously possible.
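To make the zero-shot evaluation idea concrete, here is a minimal sketch rather than the authors' implementation. It assumes the distributional SM at a state has already been learned and is available as a finite set of sampled occupancy measures, each approximated by a handful of state particles. It also assumes the normalized convention, under which a return is the expected reward under an occupancy measure scaled by 1 / (1 - gamma). The array layout, the function names `returns_from_dsm` and `cvar`, and the toy reward are all hypothetical.

```python
import numpy as np


def returns_from_dsm(dsm_samples, reward_fn, gamma):
    """Map sampled occupancy measures to return samples for a given reward.

    dsm_samples: array of shape (n_measures, n_particles, state_dim).
        Each slice dsm_samples[i] is a particle approximation of one
        sampled (normalized) occupancy measure. Hypothetical layout.
    reward_fn: maps states (..., state_dim) to rewards (...).
    gamma: discount factor in [0, 1).
    """
    rewards = reward_fn(dsm_samples)           # (n_measures, n_particles)
    # Assumed identity: return = E_{s ~ M}[r(s)] / (1 - gamma).
    return rewards.mean(axis=1) / (1.0 - gamma)


def cvar(returns, alpha):
    """Lower-tail CVaR: mean of the worst ceil(alpha * n) return samples."""
    sorted_returns = np.sort(returns)
    k = max(1, int(np.ceil(alpha * len(sorted_returns))))
    return sorted_returns[:k].mean()


# Toy stand-in for a learned distributional SM: 64 occupancy measures,
# each represented by 32 two-dimensional state particles.
rng = np.random.default_rng(0)
dsm_samples = rng.normal(size=(64, 32, 2))
gamma = 0.9


def distance_reward(states):
    # Hypothetical reward: negative distance from the origin.
    return -np.linalg.norm(states, axis=-1)


# The reward is supplied only at evaluation time; nothing is re-learned.
returns = returns_from_dsm(dsm_samples, distance_reward, gamma)
risk_neutral_value = returns.mean()
risk_sensitive_value = cvar(returns, alpha=0.1)
```

Because the reward enters only through `reward_fn`, the same `dsm_samples` can be reused to evaluate any other reward function or risk measure without further learning, which is the separation of transition structure and reward that the abstract describes.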

Code Implementations: 1 repo
