RL2Grid: Benchmarking Reinforcement Learning in Power Grid Operations
This work addresses the problem of developing adaptive controllers for power grid decarbonization. Its contribution is incremental: it benchmarks existing methods rather than proposing new ones.
The authors tackle the challenge of applying reinforcement learning to power grid operations by introducing RL2Grid, a benchmark developed with power system operators. The benchmark standardizes evaluation and shows that existing RL methods need improvement to handle real-world complexities.
Reinforcement learning (RL) can provide the adaptive and scalable controllers essential for power grid decarbonization. However, RL methods struggle with the complex dynamics, long-horizon goals, and hard physical constraints of power grids. For these reasons, we present RL2Grid, a benchmark designed in collaboration with power system operators to accelerate progress in grid control and help RL mature in this domain. Built on RTE France's power simulation framework, RL2Grid standardizes tasks, state and action spaces, and reward structures to enable systematic evaluation and comparison of RL algorithms. Moreover, we integrate operational heuristics and design safety constraints based on human expertise to ensure alignment with physical requirements. By establishing reference performance metrics for classic RL baselines on RL2Grid's tasks, we highlight the need for novel methods capable of handling real systems and discuss future directions for RL-based grid control.
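To make the ideas of a standardized task, an operational heuristic, and a safety constraint more concrete, the following is a minimal, self-contained sketch. It is not RL2Grid's API, nor that of the underlying simulation framework. Every name in it (`ToyGridEnv`, `heuristic_gate`, `relieve_most_loaded`, `evaluate`) and every number (the 0.9 loading threshold, the toy dynamics) is an assumption made for illustration. The sketch assumes one plausible form of heuristic, in which the controller is consulted only when a line is close to its limit, and one plausible form of safety signal, in which overloads are counted as a cost separate from the reward.

```python
import random
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class ToyGridEnv:
    """Hypothetical toy task with a fixed observation/action/reward contract."""
    n_lines: int = 4
    horizon: int = 50
    seed: int = 0

    def reset(self) -> List[float]:
        self._rng = random.Random(self.seed)  # reseed so episodes are reproducible
        self._t = 0
        # Observation: loading of each line as a fraction of its thermal limit.
        self._load = [0.6] * self.n_lines
        return list(self._load)

    def step(self, action: int) -> Tuple[List[float], float, float, bool]:
        # Action 0 = do nothing; action k > 0 = toy remedial action on line k-1.
        self._load = [min(1.5, max(0.0, l + self._rng.uniform(-0.05, 0.1)))
                      for l in self._load]
        if action > 0:
            self._load[action - 1] *= 0.7
        self._t += 1
        reward = 1.0 - max(self._load)                   # larger margin is better
        cost = float(sum(l > 1.0 for l in self._load))   # overloaded lines this step
        done = self._t >= self.horizon or max(self._load) >= 1.5
        return list(self._load), reward, cost, done


def heuristic_gate(policy: Callable[[List[float]], int],
                   threshold: float = 0.9) -> Callable[[List[float]], int]:
    """Assumed heuristic: stay idle while every line is below the threshold."""
    def act(obs: List[float]) -> int:
        if max(obs) < threshold:
            return 0
        return policy(obs)
    return act


def relieve_most_loaded(obs: List[float]) -> int:
    """Stand-in for a learned policy: act on the most heavily loaded line."""
    return 1 + max(range(len(obs)), key=obs.__getitem__)


def evaluate(env: ToyGridEnv, controller: Callable[[List[float]], int]) -> Dict[str, float]:
    """Run one episode and report return, constraint cost, and survival time."""
    obs, done = env.reset(), False
    total_reward, total_cost, steps = 0.0, 0.0, 0
    while not done:
        obs, reward, cost, done = env.step(controller(obs))
        total_reward += reward
        total_cost += cost
        steps += 1
    return {"return": total_reward, "cost": total_cost, "steps": steps}


if __name__ == "__main__":
    print(evaluate(ToyGridEnv(), heuristic_gate(relieve_most_loaded)))
```

Because the environment contract and the evaluation loop are fixed, different controllers can be swapped into `evaluate` and compared on the same metrics, which is the role a benchmark's reference baselines play.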