LG · May 29

Multivariate Distributional Reinforcement Learning Using Sliced Divergences

arXiv:2605.31222 · 61.4
Predicted impact: top 35% in LG · last 90 days
Originality: Incremental advance
AI Analysis

This work proposes an approach for modeling multivariate return distributions in reinforcement learning, relevant to researchers and practitioners working on control problems where several return dimensions matter. It is rated an incremental advance in distributional reinforcement learning (DRL).

The paper addresses the challenge of extending distributional reinforcement learning (DRL) to multivariate settings, where common metrics either do not generalize beyond one dimension or lose computational tractability. The authors introduce Sliced Distributional Reinforcement Learning (SDRL), which projects multivariate return distributions onto one-dimensional directions and compares the projections with tractable one-dimensional divergences. They prove Bellman contraction for uniform slicing under shared scalar discounting and for a maximum-slicing variant under general dense discount matrices.

Distributional reinforcement learning (DRL) models the full return distribution rather than expectations, but extending it to multivariate settings remains challenging. Many common metrics do not naturally generalize beyond one dimension or lose computational tractability, and the multivariate case introduces additional difficulties such as general matrix discounting, for which no contraction results are available. We introduce Sliced Distributional Reinforcement Learning (SDRL), which lifts tractable one-dimensional divergences to multivariate return distributions via projections. We prove Bellman contraction for uniform slicing under shared scalar discounting, and introduce a maximum-slicing variant with contraction under general dense discount matrices. SDRL supports a broad class of base divergences; we analyze Wasserstein, Cramér, and Maximum Mean Discrepancy (MMD), and characterize which SDRL variants suit the standard single-sample Bellman update used in distributional RL. We evaluate SDRL on a toy chain problem and a gridworld image-based environment as well as a subset of Atari games.
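The slicing idea in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names (`wasserstein_1d`, `random_directions`, `sliced_divergence`), the choice of the 1-Wasserstein distance as base divergence, and the use of a finite set of random unit directions are all assumptions made for illustration. In particular, the maximum over sampled directions only approximates a true maximum over all directions, and the paper's exact definitions (powers, normalization, sampling scheme) may differ.

```python
import numpy as np


def wasserstein_1d(x, y, p=1):
    """p-Wasserstein distance between two equal-size 1-D empirical samples.

    For equal-size samples the optimal coupling matches sorted values.
    """
    x_sorted, y_sorted = np.sort(x), np.sort(y)
    return np.mean(np.abs(x_sorted - y_sorted) ** p) ** (1.0 / p)


def random_directions(num_slices, dim, rng):
    """Unit vectors drawn uniformly from the sphere (normalized Gaussians)."""
    v = rng.standard_normal((num_slices, dim))
    return v / np.linalg.norm(v, axis=1, keepdims=True)


def sliced_divergence(X, Y, directions, base=wasserstein_1d, reduce="mean"):
    """Lift a 1-D divergence to d dimensions via projections.

    X, Y: (n, d) arrays of multivariate return samples (hypothetical inputs).
    reduce="mean" averages over slices (uniform slicing);
    reduce="max" takes the largest slice (a sampled stand-in for max-slicing).
    """
    proj_x = X @ directions.T  # shape (n, num_slices)
    proj_y = Y @ directions.T
    values = np.array(
        [base(proj_x[:, k], proj_y[:, k]) for k in range(directions.shape[0])]
    )
    if reduce == "mean":
        return values.mean()
    if reduce == "max":
        return values.max()
    raise ValueError("reduce must be 'mean' or 'max'")


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.standard_normal((256, 3))        # samples of a 3-D return
    Y = X + np.array([1.0, 0.0, 0.0])        # same samples shifted along axis 0
    dirs = random_directions(64, 3, rng)
    print("uniform slicing:", sliced_divergence(X, Y, dirs, reduce="mean"))
    print("max slicing:    ", sliced_divergence(X, Y, dirs, reduce="max"))
```

In this example the two sample sets differ by a unit shift, so each slice's 1-Wasserstein distance equals the absolute projection of the shift onto that direction. The uniform average is therefore no larger than the maximum, and the maximum is at most 1.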

