LG · SY · SY · Apr 2

Physics Informed Reinforcement Learning with Gibbs Priors for Topology Control in Power Grids

arXiv:2604.01830 · 44.0 · h-index: 4
AI Analysis

This work addresses the problem of efficient and effective topology control for power grid operators, offering a domain-specific incremental improvement by integrating physics constraints into reinforcement learning.

The paper tackles topology control in power grids, a challenging sequential decision-making problem, by proposing a physics-informed reinforcement learning framework with Gibbs priors. It matches oracle-level performance while being approximately 6x faster on one benchmark, and it improves over a PPO baseline by up to 255% in reward on another.

Topology control for power grid operation is a challenging sequential decision-making problem because the action space grows combinatorially with the size of the grid and evaluating actions through simulation is computationally expensive. We propose a physics-informed reinforcement learning framework that combines semi-Markov control with a Gibbs prior over the action space that encodes the system's physics. A decision is taken only when the grid enters a hazardous regime, while a graph neural network surrogate predicts the post-action overload risk of feasible topology actions. These predictions are used to construct a physics-informed Gibbs prior that both selects a small state-dependent candidate set and reweights policy logits before action selection. In this way, our method reduces exploration difficulty and online simulation cost while preserving the flexibility of a learned policy. We evaluate the approach in three realistic benchmark environments of increasing difficulty. Across all settings, the proposed method achieves a strong balance between control quality and computational efficiency: it matches oracle-level performance while being approximately $6\times$ faster on the first benchmark, reaches $94.6\%$ of oracle reward with roughly $200\times$ lower decision time on the second, and on the most challenging benchmark improves over a PPO baseline by up to $255\%$ in reward and $284\%$ in survived steps while remaining about $2.5\times$ faster than a strong specialized engineering baseline. These results show that our method provides an effective mechanism for topology control in power grids.
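The abstract describes the mechanism only in words, so the following is a minimal sketch of how a Gibbs prior could both prune the action set and reweight policy logits. It is not the authors' implementation. The prior form p(a) ∝ exp(-beta * risk(a)), the top-k pruning rule, the additive combination of logits and log-prior, the loading threshold of 0.95, and every function and parameter name are assumptions made for illustration. The risk values stand in for the output of the graph neural network surrogate.

```python
import math
import random


def gibbs_prior(risk, beta=5.0):
    """Assumed Gibbs form: p(a) proportional to exp(-beta * risk[a])."""
    lo = min(risk)  # shift by the minimum so the largest weight is exactly 1
    weights = [math.exp(-beta * (r - lo)) for r in risk]
    total = sum(weights)
    return [w / total for w in weights]


def candidate_set(risk, k):
    """Hypothetical pruning rule: keep the k actions with the lowest predicted risk."""
    return sorted(range(len(risk)), key=lambda a: risk[a])[:k]


def reweighted_probs(policy_logits, risk, candidates, beta=5.0):
    """Softmax over (logit + log prior) restricted to the candidates.

    The prior's normalising constant cancels inside the softmax, so
    adding log p(a) is the same as subtracting beta * risk[a].
    """
    scores = [policy_logits[a] - beta * risk[a] for a in candidates]
    top = max(scores)
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


def should_act(line_loading, threshold=0.95):
    """Semi-Markov trigger (assumed rule): act only if some line is near its limit."""
    return max(line_loading) >= threshold


def select_action(policy_logits, risk, k=3, beta=5.0, rng=random):
    """Sample one action from the reweighted distribution over the candidate set."""
    candidates = candidate_set(risk, k)
    probs = reweighted_probs(policy_logits, risk, candidates, beta)
    return rng.choices(candidates, weights=probs, k=1)[0]
```

In this sketch a larger beta concentrates the prior on low-risk actions and lets the surrogate dominate the learned policy, while beta = 0 recovers the unmodified policy over the candidate set. An agent loop would call should_act on the current line loadings and invoke select_action only when it returns True.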

