Iterative Learning Control-Informed Reinforcement Learning for Batch Process Control

Runze Lin, Ziqi Zhuo, Junghui Chen, Lei Xie, Hongye Su

arXiv:2603.15180 | 24.5 | h-index: 13

AI Analysis

The paper addresses safety and adoption barriers for DRL in industrial batch process control. It is an incremental improvement that integrates established control methods into DRL training.

The paper tackles the safety and stability issues of Deep Reinforcement Learning (DRL) in industrial process control by proposing an Iterative Learning Control-Informed Reinforcement Learning (IL-CIRL) framework, which incorporates Kalman filter-based state estimation to guide DRL agents toward stable and constraint-satisfying control policies for batch processes under disturbances.

A significant limitation of Deep Reinforcement Learning (DRL) is the stochastic uncertainty of the actions generated while balancing exploration and exploitation, which poses substantial safety risks during both training and deployment. In industrial process control, the lack of formal stability and convergence guarantees further inhibits adoption of DRL methods by practitioners. By contrast, Iterative Learning Control (ILC) is a well-established autonomous control methodology for repetitive systems, particularly in batch process optimization. ILC achieves the desired control performance by iteratively refining the control law, either between consecutive batches or within individual batches, to compensate for both repetitive and non-repetitive disturbances. This study introduces an Iterative Learning Control-Informed Reinforcement Learning (IL-CIRL) framework for training DRL controllers in dual-layer batch-to-batch and within-batch control architectures for batch processes. The proposed method incorporates Kalman filter-based state estimation within the iterative learning structure to guide DRL agents toward control policies that satisfy operational constraints and provide stability guarantees. This approach enables the systematic design of DRL controllers for batch processes operating under multiple disturbance conditions.
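
To make the batch-to-batch refinement idea behind ILC concrete, the following is a minimal sketch of a textbook P-type ILC update on a toy scalar plant. It is not the paper's IL-CIRL method and includes neither the Kalman filter nor the DRL agent; the plant parameters `a` and `b`, the learning gain, the reference trajectory, and the function names `run_batch` and `p_type_ilc` are all illustrative assumptions.

```python
import numpy as np


def run_batch(u, a=0.3, b=1.0, x0=0.0):
    """Simulate one batch of the assumed toy plant x[t+1] = a*x[t] + b*u[t].

    Returns y with y[t] = x[t+1], so each output sample lines up with the
    input sample that first influences it.
    """
    x = x0
    y = np.empty_like(u)
    for t, u_t in enumerate(u):
        x = a * x + b * u_t
        y[t] = x
    return y


def p_type_ilc(reference, n_batches=30, gain=0.8):
    """Refine the input trajectory from batch to batch with a P-type law.

    Update rule: u_{k+1}[t] = u_k[t] + gain * e_k[t], with e_k = reference - y_k.
    The gain is an assumption chosen so that the tracking error contracts
    for this particular toy plant.
    """
    u = np.zeros_like(reference)
    errors = []
    for _ in range(n_batches):
        y = run_batch(u)                         # run the current batch
        e = reference - y                        # tracking error of this batch
        errors.append(float(np.max(np.abs(e))))  # record worst-case error
        u = u + gain * e                         # learn for the next batch
    return u, errors


# Hypothetical reference: a ramp over a 20-step batch.
reference = np.linspace(0.0, 1.0, 20)
u_final, errors = p_type_ilc(reference)
print(f"first-batch error: {errors[0]:.3f}, last-batch error: {errors[-1]:.2e}")
```

The input trajectory from one batch is reused as the starting point for the next, so the worst-case tracking error shrinks across batches for this choice of plant and gain. A larger gain or a slower plant can break that contraction, which is one reason convergence analysis matters in ILC design.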
