LG · Mar 10

Impact of Markov Decision Process Design on Sim-to-Real Reinforcement Learning

Tatjana Krau, Jorge Mandlmaier, Tobias Damm, Frieder Heieck

arXiv:2603.09427v2 · 4.7 · h-index: 17

Predicted impact: top 90% in LG · last 90 days

Originality: Incremental advance

AI Analysis

This work provides practical guidelines for deploying RL in industrial process control, but it is incremental: it focuses on specific design choices rather than a breakthrough.

The paper tackles the sim-to-real gap in reinforcement learning for industrial process control by analyzing how Markov Decision Process design choices affect transfer, finding that physics-based dynamics models achieve up to 50% real-world success in a color mixing task where simplified models fail.

Reinforcement Learning (RL) has demonstrated strong potential for industrial process control, yet policies trained in simulation often suffer from a significant sim-to-real gap when deployed on physical hardware. This work systematically analyzes how core Markov Decision Process (MDP) design choices -- state composition, target inclusion, reward formulation, termination criteria, and environment dynamics models -- affect this transfer. Using a color mixing task, we evaluate different MDP configurations and mixing dynamics across simulation and real-world experiments. We validate our findings on physical hardware, demonstrating that physics-based dynamics models achieve up to 50% real-world success under strict precision constraints where simplified models fail entirely. Our results provide practical MDP design guidelines for deploying RL in industrial process control.
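To make the design choices named in the abstract concrete, the following is a minimal sketch of a toy color-mixing environment in which target inclusion, reward formulation, and termination criteria are configuration switches. It is not the authors' environment: `MDPConfig`, `ColorMixEnv`, every parameter name and default, and the volume-weighted linear blend used as the dynamics are all assumptions made for illustration. The linear blend is closer in spirit to a simplified dynamics model than to a physics-based one.

```python
from dataclasses import dataclass
import random


@dataclass
class MDPConfig:
    # All fields are hypothetical knobs, not taken from the paper.
    include_target: bool = True   # target inclusion: append target color to the state
    dense_reward: bool = True     # reward: negative distance vs. sparse success bonus
    tolerance: float = 0.05       # termination: success when distance <= tolerance
    max_steps: int = 20           # termination: step budget per episode


class ColorMixEnv:
    """Toy color-mixing MDP with assumed volume-weighted linear blending."""

    # Hypothetical pigments expressed as RGB triples.
    PIGMENTS = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]

    def __init__(self, config, seed=0):
        self.cfg = config
        self.rng = random.Random(seed)

    def reset(self):
        self.target = tuple(self.rng.random() for _ in range(3))
        self.color = (1.0, 1.0, 1.0)  # start from a white base
        self.volume = 1.0
        self.steps = 0
        return self._state()

    def _state(self):
        # State composition: current color, optionally followed by the target.
        state = list(self.color)
        if self.cfg.include_target:
            state += list(self.target)
        return state

    def _distance(self):
        # Euclidean distance between current and target color.
        return sum((c - t) ** 2 for c, t in zip(self.color, self.target)) ** 0.5

    def step(self, action):
        pigment_idx, dose = action  # dose must be positive
        pigment = self.PIGMENTS[pigment_idx]
        total = self.volume + dose
        # Simplified dynamics: blend proportionally to volume.
        self.color = tuple(
            (c * self.volume + p * dose) / total
            for c, p in zip(self.color, pigment)
        )
        self.volume = total
        self.steps += 1

        dist = self._distance()
        success = dist <= self.cfg.tolerance
        done = success or self.steps >= self.cfg.max_steps
        if self.cfg.dense_reward:
            reward = -dist
        else:
            reward = 1.0 if success else 0.0
        return self._state(), reward, done, {"success": success}
```

Under these assumptions, two configurations can be compared by constructing `ColorMixEnv(MDPConfig(include_target=False))` and `ColorMixEnv(MDPConfig(dense_reward=False))` and training the same agent on each; swapping the body of `step` for a different mixing model would correspond to changing the environment dynamics.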
