LG · Feb 26, 2021

Low-Precision Reinforcement Learning: Running Soft Actor-Critic in Half Precision

arXiv:2102.13565v2 · 5 citations
AI Analysis

This work reduces the compute and memory costs of training reinforcement learning (RL) agents, an incremental but practical step toward making low-precision methods viable for RL.

The paper applies low-precision training to reinforcement learning, specifically to the Soft Actor-Critic (SAC) agent in continuous control. With six straightforward modifications, the agent matches full-precision rewards while requiring less memory and compute.

Low-precision training has become a popular approach to reduce compute requirements, memory footprint, and energy consumption in supervised learning. In contrast, this promising approach has not yet enjoyed similarly widespread adoption within the reinforcement learning (RL) community, partly because RL agents can be notoriously hard to train even in full precision. In this paper we consider continuous control with the state-of-the-art SAC agent and demonstrate that a naïve adaptation of low-precision methods from supervised learning fails. We propose a set of six modifications, all straightforward to implement, that leaves the underlying agent and its hyperparameters unchanged but improves the numerical stability dramatically. The resulting modified SAC agent has lower memory and compute requirements while matching full-precision rewards, demonstrating that low-precision training can substantially accelerate state-of-the-art RL without parameter tuning.
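As general background on why half precision can be numerically fragile (this is not the authors' analysis, and the abstract does not list the six modifications), the sketch below rounds a few values to IEEE 754 binary16 using only the Python standard library. The helper name `to_half` is hypothetical and introduced here for illustration.

```python
import struct


def to_half(x: float) -> float:
    """Round a Python float to IEEE 754 binary16 and back.

    struct raises OverflowError for finite values too large for binary16;
    we map that case to +/-inf, which is what fp16 hardware would produce.
    """
    try:
        return struct.unpack("<e", struct.pack("<e", x))[0]
    except OverflowError:
        return float("inf") if x > 0 else float("-inf")


# The largest finite binary16 value is 65504; larger magnitudes overflow.
print(to_half(65504.0))
print(to_half(1e5))

# Magnitudes far below the smallest subnormal (2**-24) underflow to zero.
print(to_half(1e-8))

# Near 1.0 the spacing between representable values is 2**-10,
# so an increment of 1e-4 is rounded away entirely.
print(to_half(1.0 + 1e-4) == 1.0)
```

Overflow, underflow, and lost small updates of this kind are the usual failure modes that low-precision training schemes have to work around; which of them matter most for SAC is a question for the paper itself.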
