SY LG | Jan 23, 2024

A Safe Reinforcement Learning Algorithm for Supervisory Control of Power Plants

Yixuan Sun, Sami Khairy, Richard B. Vilim, Rui Hu, Akshay J. Dave

arXiv:2401.13020v13.310 citations | h-index: 12 | Knowledge-Based Systems

Originality: Incremental advance

AI Analysis

This paper addresses safe control of power plants, a problem relevant to engineers and operators. The advance is incremental: it builds on existing RL methods and adds a specific constraint-handling technique.

The paper tackles the challenge of imposing state constraints in power plant control using reinforcement learning by proposing a chance-constrained RL algorithm based on Proximal Policy Optimization, which achieves the smallest distance of violation and violation rate in load-follow maneuvers for an advanced nuclear power plant design.

Traditional control-theoretic methods require tailored engineering for each system and constant fine-tuning. In power plant control, one often needs to obtain a precise representation of the system dynamics and carefully design the control scheme accordingly. Model-free reinforcement learning (RL) has emerged as a promising solution for control tasks because it learns from trial-and-error interactions with the environment. It eliminates the need to model the environment's dynamics explicitly, and such models are potentially inaccurate. However, directly imposing state constraints in power plant control is challenging for standard RL methods. To address this, we propose a chance-constrained RL algorithm based on Proximal Policy Optimization for supervisory control. Our method employs Lagrangian relaxation to convert the constrained optimization problem into an unconstrained objective, where trainable Lagrange multipliers enforce the state constraints. Our approach achieves the smallest distance of violation and violation rate in a load-follow maneuver for an advanced nuclear power plant design.
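The Lagrangian relaxation mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the update rule, so this uses a generic projected dual-ascent step rather than the authors' exact procedure. The function names, the `tolerance` for the allowed violation rate, and the `step_size` are all hypothetical and chosen only for illustration.

```python
def lagrangian_objective(reward_term: float, violation_rate: float,
                         multiplier: float, tolerance: float) -> float:
    """Unconstrained objective: reward minus a penalty on excess violation.

    Assumed form (not taken from the paper): the chance constraint is
    violation_rate <= tolerance, and the multiplier weights the excess.
    """
    return reward_term - multiplier * (violation_rate - tolerance)


def update_multiplier(multiplier: float, violation_rate: float,
                      tolerance: float, step_size: float = 0.05) -> float:
    """One projected dual-ascent step on the Lagrange multiplier.

    The multiplier grows while the constraint is violated and shrinks
    otherwise; max(0, .) keeps it non-negative.
    """
    return max(0.0, multiplier + step_size * (violation_rate - tolerance))


if __name__ == "__main__":
    # Hypothetical violation rates observed over successive policy updates.
    multiplier = 0.0
    for rate in [0.30, 0.20, 0.10, 0.02]:
        multiplier = update_multiplier(multiplier, rate, tolerance=0.05)
        print(f"violation rate {rate:.2f} -> multiplier {multiplier:.4f}")
```

In a full training loop, the policy would be updated with a PPO step on the penalized objective while the multiplier is adjusted between updates, so the penalty tightens only when the observed violation rate exceeds the tolerance.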
