Distributional Reinforcement Learning for Scheduling of Chemical Production Processes
This work addresses scheduling challenges in the chemical production industry by enabling faster, risk-aware online decisions. The contribution is incremental, however, since it adapts RL to a specific domain with existing constraints.
The paper tackles production scheduling under uncertainty in chemical processes by proposing a distributional reinforcement learning method that handles constraints and optimizes risk-sensitive measures such as the conditional value-at-risk (CVaR). The method achieves expected performance comparable to MILP methods while making online decisions orders of magnitude faster.
Reinforcement Learning (RL) has recently received significant attention from the process systems engineering and control communities. Recent works have investigated the application of RL to identify optimal scheduling decisions in the presence of uncertainty. In this work, we present an RL methodology tailored to efficiently address production scheduling problems in the presence of uncertainty. We consider restrictions commonly imposed on these problems, such as precedence and disjunctive constraints, which are not naturally considered by RL in other contexts. Additionally, this work naturally enables the optimization of risk-sensitive formulations such as the conditional value-at-risk (CVaR), which are essential in realistic scheduling processes. The proposed strategy is investigated thoroughly in a parallel batch production environment and benchmarked against mixed integer linear programming (MILP) strategies. We show that the policy identified by our approach is able to account for plant uncertainties in online decision-making, with expected performance comparable to existing MILP methods. Additionally, the framework gains the benefits of optimizing for risk-sensitive measures, and identifies online decisions orders of magnitude faster than the most efficient optimization approaches. This promises to mitigate practical issues and to ease the handling of realizations of process uncertainty in the paradigm of online production scheduling.
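
To make the risk-sensitive objective concrete, the following minimal sketch estimates CVaR from a set of sampled episode returns. It is an illustration under stated assumptions and not the estimator used in this work: the function name `empirical_cvar`, the sample values, and the convention that CVaR at level alpha is the mean of the worst (lowest) alpha-fraction of returns are all assumptions made here for a reward-maximization setting.

```python
import math


def empirical_cvar(returns, alpha):
    """Estimate CVaR as the mean of the lowest alpha-fraction of returns.

    Assumed convention: higher returns are better, so the "risky" tail
    is the lower tail. alpha = 1.0 recovers the ordinary sample mean.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must lie in (0, 1]")
    if not returns:
        raise ValueError("returns must be non-empty")
    ordered = sorted(returns)  # ascending: worst outcomes first
    # Number of samples in the tail; always keep at least one sample.
    k = max(1, math.ceil(alpha * len(ordered)))
    tail = ordered[:k]
    return sum(tail) / k


# Hypothetical returns from ten simulated scheduling episodes.
returns = [10.0, 12.0, -5.0, 8.0, 11.0, -20.0, 9.0, 13.0, 7.0, 10.0]

mean_return = sum(returns) / len(returns)  # risk-neutral objective
cvar_20 = empirical_cvar(returns, 0.2)     # mean of the two worst episodes

print(f"expected return: {mean_return}")
print(f"CVaR at alpha=0.2: {cvar_20}")
```

In this hypothetical sample the expected return hides two poor episodes that dominate the tail estimate, which is the kind of outcome a risk-sensitive policy would be trained to avoid.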