LGSY
Apr 27, 2025

HyperController: A Hyperparameter Controller for Fast and Stable Training of Reinforcement Learning Neural Networks

arXiv:2504.19382v1, 3 citations, h-index: 22, Has Code
Originality: Incremental advance
AI Analysis

This paper addresses slow and unstable training in reinforcement learning, a problem for both researchers and practitioners. It appears incremental, since it builds on existing hyperparameter optimization methods and adapts them to this setting.

The paper tackles hyperparameter optimization in reinforcement learning by introducing HyperController, which models the problem as a Linear Gaussian Dynamical System and uses a Kalman filter for efficient learning. Compared to other algorithms, it achieves the highest median reward in four out of five OpenAI Gymnasium environments.

We introduce Hyperparameter Controller (HyperController), a computationally efficient algorithm for hyperparameter optimization during training of reinforcement learning neural networks. HyperController optimizes hyperparameters quickly while the reinforcement learning neural network continues to improve, resulting in faster training and deployment. It achieves this by modeling the hyperparameter optimization problem as an unknown Linear Gaussian Dynamical System, which is a system whose state evolves linearly. It then learns an efficient representation of the hyperparameter objective function using the Kalman filter, which is the optimal one-step predictor for a Linear Gaussian Dynamical System. To demonstrate its performance, HyperController is applied as a hyperparameter optimizer during training of reinforcement learning neural networks on a variety of OpenAI Gymnasium environments. In four out of the five Gymnasium environments, HyperController achieves the highest median reward during evaluation compared to other algorithms. The results show the potential of HyperController for efficient and stable training of reinforcement learning neural networks.
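
To make the Kalman filter's role as a one-step predictor concrete, here is a minimal sketch of a single predict-then-update step for a Linear Gaussian Dynamical System, followed by a small simulation. It is not the authors' implementation. The abstract describes the system as unknown, whereas this sketch assumes the matrices `A`, `C`, `Q`, and `R` are known. The function names `kalman_step` and `simulate_and_filter` and all numeric values are hypothetical, chosen for illustration only.

```python
import numpy as np


def kalman_step(x_hat, P, y, A, C, Q, R):
    """One Kalman filter step for the (assumed known) system
    x[t+1] = A x[t] + w,  y[t] = C x[t] + v,  w ~ N(0, Q),  v ~ N(0, R).

    x_hat, P: previous state estimate and its covariance.
    y: the new observation. Returns the updated estimate and covariance.
    """
    # Predict: push the estimate and its covariance through the dynamics.
    x_pred = A @ x_hat
    P_pred = A @ P @ A.T + Q
    # Update: correct the prediction with the new observation.
    S = C @ P_pred @ C.T + R              # innovation covariance
    K = P_pred @ C.T @ np.linalg.inv(S)   # Kalman gain
    x_new = x_pred + K @ (y - C @ x_pred)
    P_new = (np.eye(len(x_hat)) - K @ C) @ P_pred
    return x_new, P_new


def simulate_and_filter(steps=100, seed=0):
    """Simulate a scalar system (illustrative values) and record the
    filter's one-step-ahead predictions of each observation."""
    rng = np.random.default_rng(seed)
    A = np.array([[0.95]])
    C = np.array([[1.0]])
    Q = np.array([[0.01]])
    R = np.array([[0.25]])
    x = np.array([1.0])                   # true (hidden) state
    x_hat, P = np.zeros(1), np.eye(1)     # initial estimate and covariance
    preds, obs = [], []
    for _ in range(steps):
        # Predict the next observation before it is seen.
        preds.append((C @ A @ x_hat).item())
        x = A @ x + rng.normal(0.0, np.sqrt(Q[0, 0]), size=1)
        y = C @ x + rng.normal(0.0, np.sqrt(R[0, 0]), size=1)
        obs.append(y.item())
        x_hat, P = kalman_step(x_hat, P, y, A, C, Q, R)
    return np.array(preds), np.array(obs), P
```

In a hyperparameter-optimization setting, the observation `y` would presumably be some measured training signal such as reward, but that mapping is an assumption here and is not specified in the abstract.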

Code Implementations: 1 repo
