SYLG, Jun 12, 2019

Adaptive Optimal Control for Reference Tracking Independent of Exo-System Dynamics

arXiv:1906.05085v5, 10 citations
AI Analysis

The paper addresses a key limitation of reinforcement-learning-based control in applications that require flexible trajectory tracking, such as autonomous driving, and is assessed as a novel advance rather than an incremental improvement.

Existing model-free control methods can only track references generated by a time-invariant exo-system, which rules out applications such as autonomous driving. The paper introduces an adaptive optimal control method that tracks reference trajectories not generated by such an exo-system and demonstrates it on two real-world examples.

Model-free control based on the idea of Reinforcement Learning is a promising approach that has recently gained extensive attention. However, Reinforcement-Learning-based control methods either focus solely on the regulation problem or learn to track a reference that is generated by a time-invariant exo-system. In the latter case, controllers can only track the time-invariant reference dynamics on which they have been trained, and they need to be re-trained each time the reference dynamics change. Consequently, these methods fail in a number of applications that rely on a reference trajectory not generated by an exo-system. One prominent example is autonomous driving. This paper provides, for the first time, an adaptive optimal control method capable of tracking reference trajectories that are not generated by a time-invariant exo-system. The main innovation is a novel Q-function that directly incorporates a given reference trajectory on a moving horizon. This new Q-function exhibits a particular structure that allows the design of an efficient, iterative, provably convergent Reinforcement Learning algorithm that enables optimal tracking. Two real-world examples demonstrate the effectiveness of the new method.
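To make the idea of a Q-function that takes a reference on a moving horizon more concrete, here is a minimal sketch. It is not the paper's formulation: it assumes a linear-quadratic setting in which the Q-function is quadratic in the stacked vector z = [x; u; r_k, ..., r_{k+N-1}], and the names `reference_window`, `q_value`, `greedy_input`, and the matrix `H` are hypothetical. The point is only that the reference enters as a window of upcoming samples, so no exo-system model of the reference is required.

```python
import numpy as np


def reference_window(reference, k, horizon):
    """Stack the next `horizon` reference samples, starting at step k.

    `reference` has shape (T, n_r). Past the end of the trajectory the
    final sample is held (an assumption made for this sketch).
    """
    idx = np.minimum(np.arange(k, k + horizon), len(reference) - 1)
    return reference[idx].reshape(-1)


def q_value(H, x, u, r_window):
    """Assumed quadratic Q-function z^T H z with z = [x; u; r_window]."""
    z = np.concatenate([x, u, r_window])
    return float(z @ H @ z)


def greedy_input(H, x, r_window, n_x, n_u):
    """Input minimising z^T H z over u, for symmetric H with H_uu > 0.

    Setting the gradient 2 (H_ux x + H_uu u + H_ur r) to zero gives
    u = -H_uu^{-1} (H_ux x + H_ur r).
    """
    rows = slice(n_x, n_x + n_u)
    H_ux = H[rows, :n_x]
    H_uu = H[rows, n_x:n_x + n_u]
    H_ur = H[rows, n_x + n_u:]
    return -np.linalg.solve(H_uu, H_ux @ x + H_ur @ r_window)
```

In this sketch a learning algorithm would estimate `H` from data. At run time the controller only needs the current state and the upcoming reference samples, so the reference can change without re-training as long as the window is supplied.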

