To Learn or Not to Learn: A Litmus Test for Using Reinforcement Learning in Control
For control engineers deciding between model-based and RL-based control, this test provides a principled, low-cost way to avoid unnecessary RL training when it is unlikely to be beneficial.
This paper introduces a computationally efficient, simulation-based litmus test to predict whether reinforcement learning (RL) will outperform model-based control for a given control problem, without requiring RL training. The test evaluates the impact of model uncertainties on the control problem and the learnability of those uncertainties, and is demonstrated on several benchmarks to show potential computational savings.
Reinforcement learning (RL) can be a powerful alternative to classical control methods when standard model-based control is insufficient, e.g., when deriving a suitable model is intractable or impossible. In many cases, however, the choice between model-based and RL-based control is not obvious. Due to the high computational costs of training RL agents, RL-based control should be limited to cases where it is expected to yield superior results compared to model-based control. To the best of our knowledge, there exists no approach to quantify the benefit of RL-based control that does not require RL training. In this work, we present a computationally efficient, purely simulation-based litmus test predicting whether RL-based control is superior to model-based control. Our test evaluates the suitability of the given model for model-based control by analyzing the impact of model uncertainties on the control problem. For this, we use reachset-conformant model identification combined with simulation-based analysis. This is followed by a learnability evaluation of the uncertainties based on correlation analysis. This two-part analysis enables an informed decision on the suitability of RL for a control problem without training an RL agent. We apply our test to several benchmarks, demonstrating its applicability to a wide range of control problems and highlighting the potential to save computational resources.
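To give a rough intuition for the learnability evaluation, the following minimal sketch checks whether model residuals are correlated with the observed states. It is an illustrative assumption, not the procedure used in this work: the function name `max_abs_correlation`, the synthetic data, and the choice of Pearson correlation are all hypothetical, and Pearson correlation captures only linear dependence.

```python
import numpy as np


def max_abs_correlation(features, residuals):
    """Largest absolute Pearson correlation between any feature column
    and any residual column (hypothetical helper for illustration)."""
    features = np.asarray(features, dtype=float)
    residuals = np.asarray(residuals, dtype=float)
    n_features = features.shape[1]
    # rowvar=False: each column is a variable, each row an observation.
    corr = np.corrcoef(features, residuals, rowvar=False)
    # Keep only the feature-vs-residual block of the correlation matrix.
    cross = corr[:n_features, n_features:]
    return float(np.max(np.abs(cross)))


# Synthetic data (assumption): 2000 samples of a two-dimensional state.
rng = np.random.default_rng(0)
states = rng.normal(size=(2000, 2))

# Residual that depends on the first state: structured, hence potentially learnable.
structured = 0.5 * states[:, :1] + 0.05 * rng.normal(size=(2000, 1))

# Residual that is pure noise: nothing for a learner to exploit.
unstructured = rng.normal(size=(2000, 1))

score_structured = max_abs_correlation(states, structured)
score_unstructured = max_abs_correlation(states, unstructured)

print(f"structured residual:   {score_structured:.3f}")
print(f"unstructured residual: {score_unstructured:.3f}")
```

Under these assumptions, a high score suggests that the uncertainty has structure a learning-based controller could exploit, while a score near zero suggests it behaves like noise with respect to the observed states.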