Evolutionary Warm-Starts for Reinforcement Learning in Industrial Continuous Control
For researchers applying RL to industrial control, this paper demonstrates a method to improve training stability and performance, but it is an incremental contribution on a simple benchmark.
This work shows that using CMA-ES to warm-start RL agents significantly improves stability and performance on an industrial sorting benchmark, providing a proof of concept for hybrid evolutionary-RL approaches.
Reinforcement learning (RL) is still rarely applied in industrial control, partly because it is difficult to train agents that remain reliable under real-world conditions. This work investigates how evolution strategies can support RL in such settings by introducing a continuous-control adaptation of an industrial sorting benchmark. The covariance matrix adaptation evolution strategy (CMA-ES) is used to generate high-quality demonstrations that warm-start RL agents. Results show that CMA-ES-guided initialization significantly improves training stability and performance. Furthermore, the demonstration trajectories generated with CMA-ES provide a strong oracle reference performance level, which is of interest in its own right. The study delivers a focused proof of concept for hybrid evolutionary-RL approaches and a basis for future, more complex industrial applications.
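The warm-start idea can be illustrated with a minimal sketch. It is not the authors' implementation, and everything in it is an assumption made for illustration: the task is a toy point mass rather than the sorting benchmark, the optimizer is a simplified isotropic evolution strategy with a fixed step-size decay standing in for CMA-ES (no covariance adaptation), the policy is linear with unbounded actions, and the warm start is done by behavior cloning with least squares, which is only one plausible way to use demonstrations. The names `rollout` and `simple_es` are hypothetical.

```python
import numpy as np

def rollout(w, steps=50):
    """Run the linear policy a = w @ s on a toy point mass (assumed task).

    Returns the episode return and the visited (state, action) pairs.
    """
    s = np.array([1.0, 0.0])  # [position, velocity]; the target is position 0
    total, demo = 0.0, []
    for _ in range(steps):
        a = float(w @ s)
        demo.append((s.copy(), a))
        vel = s[1] + 0.1 * a
        pos = s[0] + 0.1 * vel
        s = np.array([pos, vel])
        total -= pos**2 + 0.01 * a**2  # penalize distance and control effort
    return total, demo

def simple_es(generations=30, popsize=16, elite=4, sigma=0.5, seed=0):
    """Simplified isotropic ES, a stand-in for CMA-ES (no covariance update)."""
    rng = np.random.default_rng(seed)
    mean = np.zeros(2)
    best_w, best_ret = mean.copy(), rollout(mean)[0]
    for _ in range(generations):
        pop = mean + sigma * rng.standard_normal((popsize, 2))
        rets = np.array([rollout(w)[0] for w in pop])
        order = np.argsort(rets)[::-1]          # best candidates first
        mean = pop[order[:elite]].mean(axis=0)  # recombine the elite
        sigma *= 0.95                           # fixed step-size decay
        if rets[order[0]] > best_ret:           # remember the best candidate
            best_w, best_ret = pop[order[0]].copy(), float(rets[order[0]])
    return best_w, best_ret

# 1) Evolutionary search produces a strong reference policy.
best_w, best_ret = simple_es()

# 2) Its trajectory serves as demonstration data.
_, demo = rollout(best_w)
states = np.array([s for s, _ in demo])
actions = np.array([a for _, a in demo])

# 3) Warm start: fit the learner's initial weights to the demonstrations
#    (behavior cloning by least squares; an assumed warm-start mechanism).
w_bc, *_ = np.linalg.lstsq(states, actions, rcond=None)

warm_ret, _ = rollout(w_bc)
zero_ret, _ = rollout(np.zeros(2))  # untrained baseline
print(f"untrained: {zero_ret:.2f}  ES reference: {best_ret:.2f}  warm start: {warm_ret:.2f}")
```

Because the demonstrator and the learner share the same linear form here, cloning essentially recovers the demonstrator's weights. With a neural-network policy the demonstrations would instead transfer behavior across parameterizations, and RL fine-tuning would continue from that initialization.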