LG · ML · Aug 19, 2022

Spectral Decomposition Representation for Reinforcement Learning

Berkeley
arXiv:2208.09515v2 · 39 citations · h-index: 67
Originality: Incremental advance
AI Analysis

This paper addresses representation learning in reinforcement learning. It offers a new method that improves the sample efficiency and applicability of spectral representations, though it builds incrementally on existing spectral approaches.

The paper tackles the limited applicability of spectral methods in reinforcement learning by proposing SPEDER, a spectral decomposition representation that extracts state-action abstractions without depending on the data collection policy and balances exploration against exploitation. It reports superior performance over current state-of-the-art algorithms across several benchmarks.

Representation learning often plays a critical role in reinforcement learning by managing the curse of dimensionality. A representative class of algorithms exploits a spectral decomposition of the stochastic transition dynamics to construct representations that enjoy strong theoretical properties in an idealized setting. However, current spectral methods suffer from limited applicability because they are constructed for state-only aggregation and derived from a policy-dependent transition kernel, without considering the issue of exploration. To address these issues, we propose an alternative spectral method, Spectral Decomposition Representation (SPEDER), that extracts a state-action abstraction from the dynamics without inducing spurious dependence on the data collection policy, while also balancing the exploration-versus-exploitation trade-off during learning. A theoretical analysis establishes the sample efficiency of the proposed algorithm in both the online and offline settings. In addition, an experimental investigation demonstrates superior performance over current state-of-the-art algorithms across several benchmarks.
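
To make the idea of a spectral decomposition of the transition dynamics concrete, the following is a minimal toy sketch, not the SPEDER algorithm itself. It assumes a small tabular MDP whose transition kernel is known exactly and is low rank, and it factors that kernel with a truncated SVD so that P(s' | s, a) = phi(s, a) · mu(s'). The function name `spectral_features`, the problem sizes, and the randomly generated kernel are all illustrative assumptions rather than details from the paper.

```python
import numpy as np


def spectral_features(P, rank):
    """Factor a tabular transition matrix with a truncated SVD.

    P has shape (n_states * n_actions, n_states); row (s, a) holds P(. | s, a).
    Returns phi with shape (n_states * n_actions, rank) and mu with shape
    (rank, n_states) such that phi @ mu approximates P.
    """
    U, singular_values, Vt = np.linalg.svd(P, full_matrices=False)
    phi = U[:, :rank] * singular_values[:rank]  # state-action features
    mu = Vt[:rank, :]                           # next-state factors
    return phi, mu


# Hypothetical toy problem: 6 states, 2 actions, and a rank-3 kernel.
rng = np.random.default_rng(0)
n_states, n_actions, rank = 6, 2, 3

# Build a valid low-rank kernel as a mixture of `rank` next-state distributions.
phi_true = rng.dirichlet(np.ones(rank), size=n_states * n_actions)
mu_true = rng.dirichlet(np.ones(n_states), size=rank)
P = phi_true @ mu_true  # each row is a probability distribution

phi, mu = spectral_features(P, rank)
recon_error = np.abs(phi @ mu - P).max()

# For any value function V, the expected next-state value E[V(s') | s, a]
# is linear in phi(s, a): (P V)(s, a) = phi(s, a) . (mu V).
V = rng.normal(size=n_states)
w = mu @ V
linear_error = np.abs(phi @ w - P @ V).max()

print(f"reconstruction error: {recon_error:.2e}")
print(f"linearity error:      {linear_error:.2e}")
```

Because the features are indexed by state-action pairs and computed from the kernel alone, they do not depend on any particular policy, which is the property the abstract emphasizes. In this sketch the kernel is given; estimating such a factorization from sampled transitions, and handling exploration, is beyond what the toy example shows.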

