Coordinating Planning and Tracking in Layered Control Policies via Actor-Critic Learning
This work addresses the integration of planning and tracking in robotic and other autonomous systems and offers a method to improve coordination between the two layers. The contribution appears incremental, since it builds on existing layered control and actor-critic frameworks.
The paper tackles the problem of coordinating planning and tracking in layered control policies. It proposes a reinforcement learning algorithm that jointly trains a trajectory planner and a tracking controller, yielding an interpretable policy. The algorithm is proven to converge to the optimal dual network in the LQR setting and is validated empirically on a unicycle model.
We propose a reinforcement learning (RL)-based algorithm to jointly train (1) a trajectory planner and (2) a tracking controller in a layered control architecture. Our algorithm arises naturally from a reformulation of the underlying optimal control problem that lends itself to an actor-critic learning approach. By explicitly learning a \textit{dual} network to coordinate the interaction between the planning and tracking layers, we demonstrate that the two components can reach an effective consensus, leading to an interpretable policy. We theoretically prove that our algorithm converges to the optimal dual network in the Linear Quadratic Regulator (LQR) setting and empirically validate its applicability to nonlinear systems through simulation experiments on a unicycle model.
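The abstract does not spell out the reformulation or the learning rule, so the following is only a toy illustration of how a dual variable can bring a planner and a tracker to consensus. It assumes a hypothetical scalar problem and uses plain dual ascent in place of the learned dual network; the function name, cost terms, and parameters (`goal`, `start`, `track_weight`, `step`) are illustrative assumptions, not taken from the paper.

```python
def dual_ascent_consensus(goal, start, track_weight, step=0.25, iters=60):
    """Toy consensus between a 'planner' variable r and a 'tracker' variable x.

    Hypothetical scalar problem (not the paper's formulation):
        minimize   0.5*(r - goal)**2 + 0.5*track_weight*(x - start)**2
        subject to r == x
    The Lagrangian adds nu*(r - x), where nu is the dual variable.
    """
    nu = 0.0
    for _ in range(iters):
        # Planner layer: minimizes 0.5*(r - goal)^2 + nu*r in closed form.
        r = goal - nu
        # Tracking layer: minimizes 0.5*w*(x - start)^2 - nu*x in closed form.
        x = start + nu / track_weight
        # Dual ascent: raise nu when the plan is ahead of the tracker (r > x),
        # lower it otherwise. This drives the residual r - x toward zero when
        # |1 - step*(1 + 1/track_weight)| < 1.
        nu += step * (r - x)
    return r, x, nu


if __name__ == "__main__":
    r, x, nu = dual_ascent_consensus(goal=1.0, start=0.0, track_weight=1.0)
    print(f"plan r={r:.6f}, tracked x={x:.6f}, dual nu={nu:.6f}")
```

In this sketch the dual variable acts as a price on disagreement: the planner is penalized for proposing targets the tracker cannot meet cheaply, and the two settle on the weighted compromise `(goal + track_weight*start) / (1 + track_weight)`. In the paper's setting the dual quantity is a learned network rather than a single scalar updated by a fixed step.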