Newton's Lantern: A Reinforcement Learning Framework for Finetuning AC Power Flow Warm Start Models

Shourya Bose, Helgi Hilmarsson, Dhruv Suri

arXiv:2605.1110243.0

AI Analysis

For power system operators, this method provides a warm start from which the Newton-Raphson solver converges reliably even under heavy loading, addressing a critical failure mode of existing supervised methods.

Newton's Lantern is a reinforcement learning framework that finetunes neural warm start models for AC power flow. On the IEEE 118-bus, GOC 500-bus, and GOC 2000-bus benchmarks, it converges on every test snapshot with the smallest mean iteration count, outperforming supervised approaches that fail near voltage collapse.

Neural warm starts can sharply reduce the number of Newton-Raphson iterations required to solve the AC power flow problem, but existing supervised approaches generalize poorly on heavily loaded instances near voltage collapse. We prove a lower bound on the Newton-Raphson iteration count that depends on the direction of the warm start error rather than on its magnitude, and show as a corollary that the bound becomes vacuous as the smallest singular value of the power-flow Jacobian shrinks, identifying the failure mode of supervised regression near the saddle-node bifurcation. Motivated by this analysis, we introduce Newton's Lantern, a finetuning pipeline that combines group relative policy optimization with a learned reward model trained on perturbations of the base model's predictions, using the iteration count itself as the supervisory signal. Across IEEE 118-bus, GOC 500-bus, and GOC 2000-bus benchmarks, Newton's Lantern is the only method that converges on every test snapshot while attaining the smallest mean iteration count.
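To make the supervisory signal concrete, the following is a minimal sketch, not the authors' implementation. It counts Newton-Raphson iterations on a hypothetical two-bus system (a slack bus at 1.0 p.u. feeding one load bus over a lossless line) and then turns a group of iteration counts into group-normalized advantages, the normalization commonly used in group relative policy optimization. The toy network, the parameter values, the function names, and the choice of the negative iteration count as a direct reward are all assumptions for illustration; the abstract describes a learned reward model rather than this direct reward.

```python
import numpy as np

def residual(x, b, p_d, q_d):
    """Power mismatch at the load bus of a toy two-bus system (assumed model)."""
    theta, v = x
    return np.array([
        v * b * np.sin(theta) + p_d,                   # active power mismatch
        b * v**2 - v * b * np.cos(theta) + q_d,        # reactive power mismatch
    ])

def jacobian(x, b):
    """Analytic Jacobian of the mismatch with respect to (theta, v)."""
    theta, v = x
    return np.array([
        [v * b * np.cos(theta), b * np.sin(theta)],
        [v * b * np.sin(theta), 2.0 * b * v - b * np.cos(theta)],
    ])

def newton_iterations(x0, b=10.0, p_d=1.0, q_d=0.0, tol=1e-8, max_iter=20):
    """Return (number of Newton updates taken, converged flag)."""
    x = np.asarray(x0, dtype=float)
    for k in range(max_iter):
        f = residual(x, b, p_d, q_d)
        if np.max(np.abs(f)) < tol:
            return k, True
        x = x - np.linalg.solve(jacobian(x, b), f)
    converged = bool(np.max(np.abs(residual(x, b, p_d, q_d))) < tol)
    return max_iter, converged

def group_relative_advantages(iteration_counts, eps=1e-8):
    """Group-normalized advantages; reward = -iteration count (an assumption)."""
    rewards = -np.asarray(iteration_counts, dtype=float)
    return (rewards - rewards.mean()) / (rewards.std() + eps)

# Flat start (theta = 0, v = 1) versus a hypothetical warm start near the solution.
flat_iters, flat_ok = newton_iterations([0.0, 1.0])
warm_iters, warm_ok = newton_iterations([-0.1007, 0.9949])

# A group of candidate warm starts, scored by their iteration counts.
candidates = [[0.0, 1.0], [-0.1007, 0.9949], [-0.3, 0.9], [0.2, 1.1]]
counts = [newton_iterations(c)[0] for c in candidates]
advantages = group_relative_advantages(counts)
print(flat_iters, warm_iters, counts, advantages)
```

In this sketch, candidates that need fewer iterations than the group average receive positive advantages, so a policy update would shift the warm start model toward them regardless of how far they are from the solution in Euclidean distance.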
