LG, AI · Dec 29, 2022

Invariance to Quantile Selection in Distributional Continuous Control

arXiv:2212.14262v1 · 1 citation · h-index: 25
Originality: Synthesis-oriented
AI Analysis

This work addresses the applicability of distributional RL methods in continuous control for robotics and simulation, but it is incremental as it adapts existing algorithms to a new domain without major innovations.

The authors transferred three distributional reinforcement learning algorithms (QR-DQN, IQN, FQF) to continuous action domains by integrating them with actor-critic methods (TD3, SAC), and found that performance was qualitatively invariant to the number and placement of distributional atoms in deterministic continuous control tasks.

In recent years, distributional reinforcement learning has produced many state-of-the-art results. Increasingly sample-efficient distributional algorithms for the discrete action domain have been developed over time. They differ primarily in how they parameterize their approximations of value distributions and how they quantify the differences between those distributions. In this work, we transfer three of the best-known and most successful of those algorithms (QR-DQN, IQN and FQF) to the continuous action domain by extending two powerful actor-critic algorithms (TD3 and SAC) with distributional critics. We investigate whether the relative performance of the methods in the discrete action space translates to the continuous case. To that end, we compare them empirically on the pybullet implementations of a set of continuous control tasks. Our results indicate qualitative invariance regarding the number and placement of distributional atoms in the deterministic, continuous action setting.
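To make "number and placement of distributional atoms" concrete, the following is a minimal NumPy sketch of the quantile Huber loss that quantile-based critics of this family typically minimize. It is an illustration under stated assumptions, not the authors' implementation: the function names `midpoint_fractions` and `quantile_huber_loss`, the array shapes, and the default `kappa=1.0` are all hypothetical choices. QR-DQN uses fixed, evenly spaced quantile fractions (as in `midpoint_fractions`), IQN samples the fractions at random, and FQF learns them; in this sketch that difference is only in which `taus` array is passed in.

```python
import numpy as np


def midpoint_fractions(n):
    """Fixed quantile fractions at the midpoints of n equal bins (QR-DQN style)."""
    return (np.arange(n) + 0.5) / n


def quantile_huber_loss(pred, target, taus, kappa=1.0):
    """Quantile Huber loss between predicted quantiles and target samples.

    pred:   (N,) predicted quantile values (the "atoms")
    target: (M,) samples of the target return distribution
    taus:   (N,) quantile fraction attached to each predicted atom
    kappa:  Huber threshold (assumed default, not taken from the paper)
    """
    # Pairwise errors u[i, j] = target[j] - pred[i]
    u = target[None, :] - pred[:, None]
    abs_u = np.abs(u)
    # Huber loss: quadratic near zero, linear beyond kappa
    huber = np.where(abs_u <= kappa, 0.5 * u**2, kappa * (abs_u - 0.5 * kappa))
    # Asymmetric weight |tau - 1{u < 0}| turns the regression into quantile regression
    weight = np.abs(taus[:, None] - (u < 0).astype(float))
    # Average over target samples, sum over predicted atoms
    return (weight * huber / kappa).mean(axis=1).sum()
```

Changing the number of atoms amounts to changing `n`, and changing their placement amounts to replacing `midpoint_fractions(n)` with sampled or learned fractions; the loss itself is unchanged.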

