LG · AI · Sep 11, 2024

The Role of Deep Learning Regularizations on Actors in Offline RL

arXiv:2409.07606v3 · 3 citations · h-index: 9
Originality: Synthesis-oriented
AI Analysis

This work addresses a known bottleneck in offline RL for continuous control tasks, but the contribution is incremental: it applies existing regularization methods to a component, the actor network, where they have rarely been used.

The study tackled the problem of poor generalization in actor networks in offline reinforcement learning by applying standard deep learning regularization techniques, resulting in an average improvement of 6% across two algorithms and three domains.

Deep learning regularization techniques, such as dropout, layer normalization, or weight decay, are widely adopted in the construction of modern artificial neural networks, often resulting in more robust training processes and improved generalization capabilities. However, in the domain of Reinforcement Learning (RL), the application of these techniques has been limited, usually applied to value function estimators (Hiraoka et al., 2021; Smith et al., 2022), and may result in detrimental effects. This issue is even more pronounced in offline RL settings, which bear greater similarity to supervised learning but have received less attention. Recent work in continuous offline RL (Park et al., 2024) has demonstrated that while we can build sufficiently powerful critic networks, the generalization of actor networks remains a bottleneck. In this study, we empirically show that applying standard regularization techniques to actor networks in offline RL actor-critic algorithms yields improvements of 6% on average across two algorithms and three different continuous D4RL domains.
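To make the three techniques named in the abstract concrete, here is a minimal, dependency-free sketch of how dropout, layer normalization, and weight decay might be attached to a small actor network. It is an illustration under stated assumptions, not the authors' implementation: the layer ordering (Linear, LayerNorm, ReLU, Dropout, Linear, tanh), the decoupled form of weight decay, and every name in the block (`actor_forward`, `init_params`, `sgd_step_with_weight_decay`, and so on) are hypothetical choices made for this example.

```python
import math
import random


def layer_norm(x, eps=1e-5):
    """Normalize a feature vector to zero mean and unit variance (no learned scale/shift)."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]


def dropout(x, p, rng, training=True):
    """Inverted dropout: zero each unit with probability p, rescale survivors by 1/(1-p)."""
    if not training or p == 0.0:
        return list(x)
    keep = 1.0 - p
    return [v / keep if rng.random() < keep else 0.0 for v in x]


def linear(x, weights, bias):
    """Dense layer: one output per weight row."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def init_params(obs_dim, hidden_dim, act_dim, rng):
    """Small random weights; the initialization scheme is an arbitrary choice for this sketch."""
    def mat(rows, cols):
        return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]
    return {
        "w1": mat(hidden_dim, obs_dim), "b1": [0.0] * hidden_dim,
        "w2": mat(act_dim, hidden_dim), "b2": [0.0] * act_dim,
    }


def actor_forward(obs, params, p_drop, rng, training=True):
    """Deterministic actor: Linear -> LayerNorm -> ReLU -> Dropout -> Linear -> tanh."""
    h = linear(obs, params["w1"], params["b1"])
    h = layer_norm(h)
    h = [max(0.0, v) for v in h]
    h = dropout(h, p_drop, rng, training)  # active only while training
    out = linear(h, params["w2"], params["b2"])
    return [math.tanh(v) for v in out]     # actions squashed into [-1, 1]


def sgd_step_with_weight_decay(w, grad, lr, weight_decay):
    """Decoupled weight decay: shrink the weight toward zero, then take the gradient step."""
    return w * (1.0 - lr * weight_decay) - lr * grad


# Usage: a 4-dimensional observation mapped to a 2-dimensional action.
rng = random.Random(0)
params = init_params(obs_dim=4, hidden_dim=8, act_dim=2, rng=rng)
obs = [0.5, -1.0, 0.25, 2.0]
train_action = actor_forward(obs, params, p_drop=0.1, rng=rng, training=True)
eval_action = actor_forward(obs, params, p_drop=0.1, rng=rng, training=False)
```

In this sketch dropout is switched off at evaluation time, so `eval_action` is deterministic for a given observation, while `train_action` varies with the random mask. Which of the three regularizers helps most, and at what strength, is an empirical question that the study itself examines.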

Code Implementations: 1 repo