SY AI LGSep 8, 2025

Reinforcement learning meets bioprocess control through behaviour cloning: Real-world deployment in an industrial photobioreactor

Juan D. Gil, Ehecatl Antonio Del Rio Chanona, José L. Guzmán, Manuel Berenguel

arXiv:2509.06853v12.33 citationsh-index: 24Eng appl artif intell

Originality Incremental advance

AI Analysis

This addresses the problem of maintaining stable bioprocess conditions in industrial photobioreactors, representing an incremental advance with real-world deployment.

The paper tackled pH regulation in open photobioreactors by proposing a reinforcement learning control approach combined with behavior cloning, reducing the Integral of Absolute Error by 8% compared to PID control and decreasing control effort by 54%.

The inherent complexity of living cells as production units creates major challenges for maintaining stable and optimal bioprocess conditions, especially in open Photobioreactors (PBRs) exposed to fluctuating environments. To address this, we propose a Reinforcement Learning (RL) control approach, combined with Behavior Cloning (BC), for pH regulation in open PBR systems. This represents, to the best of our knowledge, the first application of an RL-based control strategy to such a nonlinear and disturbance-prone bioprocess. Our method begins with an offline training stage in which the RL agent learns from trajectories generated by a nominal Proportional-Integral-Derivative (PID) controller, without direct interaction with the real system. This is followed by a daily online fine-tuning phase, enabling adaptation to evolving process dynamics and stronger rejection of fast, transient disturbances. This hybrid offline-online strategy allows deployment of an adaptive control policy capable of handling the inherent nonlinearities and external perturbations in open PBRs. Simulation studies highlight the advantages of our method: the Integral of Absolute Error (IAE) was reduced by 8% compared to PID control and by 5% relative to standard off-policy RL. Moreover, control effort decreased substantially-by 54% compared to PID and 7% compared to standard RL-an important factor for minimizing operational costs. Finally, an 8-day experimental validation under varying environmental conditions confirmed the robustness and reliability of the proposed approach. Overall, this work demonstrates the potential of RL-based methods for bioprocess control and paves the way for their broader application to other nonlinear, disturbance-prone systems.

View on arXiv PDF

Similar