How Training Data Impacts Performance in Learning-based Control
This work addresses the lack of measures for assessing training data sets in learning-based control. Such measures matter for control law performance in complex systems where first-principles models are unavailable.
The paper examines how training data affects learning-based control performance: it derives an analytical relationship between data density and control performance, introduces a data set quality measure called the $\rho$-gap, and derives an ultimate bound for the tracking error under model uncertainty.
When first-principles models cannot be derived due to the complexity of the real system, data-driven methods allow us to build models from system observations. As these models are employed in learning-based control, the quality of the data plays a crucial role in the performance of the resulting control law. Nevertheless, hardly any measures exist for assessing training data sets, and the impact of the data distribution on the closed-loop system properties is largely unknown. Based on Gaussian process models, this paper derives an analytical relationship between the density of the training data and the control performance. We formulate a quality measure for the data set, which we refer to as the $\rho$-gap, and derive the ultimate bound for the tracking error under consideration of the model uncertainty. We show how the $\rho$-gap can be applied to a feedback linearizing control law and provide numerical illustrations of our approach.
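The link between training data density and model uncertainty can be illustrated with a minimal sketch. It does not implement the $\rho$-gap or the tracking error bound, whose definitions are not given here. It only shows the standard Gaussian process property that the posterior standard deviation shrinks as training inputs become denser. The squared-exponential kernel, the hyperparameter values, the 1-D input domain, and the function names `se_kernel` and `posterior_std` are all assumptions made for illustration.

```python
import numpy as np

def se_kernel(a, b, lengthscale=0.3, signal_var=1.0):
    """Squared-exponential kernel between 1-D input arrays a and b (assumed kernel)."""
    d = a[:, None] - b[None, :]
    return signal_var * np.exp(-0.5 * (d / lengthscale) ** 2)

def posterior_std(x_train, x_query, noise_var=1e-2):
    """GP posterior standard deviation at x_query.

    The posterior variance depends only on the training inputs,
    not on the observed targets, so no targets are needed here.
    """
    K = se_kernel(x_train, x_train) + noise_var * np.eye(len(x_train))
    k_star = se_kernel(x_train, x_query)
    prior_var = np.diag(se_kernel(x_query, x_query))
    # var(x) = k(x, x) - k_*(x)^T (K + noise_var * I)^{-1} k_*(x)
    v = np.linalg.solve(K, k_star)
    var = prior_var - np.sum(k_star * v, axis=0)
    return np.sqrt(np.maximum(var, 0.0))

# Hypothetical comparison: same domain, sparse versus dense training inputs.
x_query = np.linspace(-1.0, 1.0, 50)
sparse = np.linspace(-1.0, 1.0, 3)
dense = np.linspace(-1.0, 1.0, 30)

std_sparse = posterior_std(sparse, x_query).max()
std_dense = posterior_std(dense, x_query).max()
print(f"worst-case std, sparse data: {std_sparse:.3f}")
print(f"worst-case std, dense data:  {std_dense:.3f}")
```

In this sketch the worst-case posterior standard deviation over the query region is smaller for the dense data set. An uncertainty-dependent tracking error bound of the kind described above would tighten accordingly.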