Online Observer-Based Inverse Reinforcement Learning
This work addresses the inverse reinforcement learning problem and is relevant to researchers in control and machine learning, but its contribution is incremental because it builds on existing observer methods.
The paper tackles the output-feedback inverse reinforcement learning problem for linear systems with quadratic costs by framing it as a state estimation problem. It develops two observer-based techniques, one of which uses history stacks, and demonstrates their performance in simulations under noisy and noise-free measurements.
In this paper, a novel approach to the output-feedback inverse reinforcement learning (IRL) problem is developed by casting the IRL problem, for linear systems with quadratic cost functions, as a state estimation problem. Two observer-based techniques for IRL are developed, including a novel observer method that re-uses previous state estimates via history stacks. Theoretical guarantees for convergence and robustness are established under appropriate excitation conditions. Simulations demonstrate the performance of the developed observers and filters under noisy and noise-free measurements.
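To make the history-stack idea concrete, the following is a minimal Python sketch of a generic estimator that re-uses stored data to recover unknown parameters. It is not the observer developed in the paper: the linear model `Y = Phi @ theta`, the names `history_stack_estimate`, `Phi`, `Y`, and `theta_true`, and the gradient-style update are all assumptions chosen for illustration. The full-rank requirement on the stored data plays a role loosely analogous to the excitation conditions mentioned above.

```python
import numpy as np

def history_stack_estimate(Phi, Y, steps=2000):
    """Estimate theta in Y = Phi @ theta by repeatedly re-using stored data.

    Phi and Y form a hypothetical "history stack" of recorded regressors
    and targets; the estimate converges when Phi has full column rank.
    """
    # Step size from the largest singular value of Phi keeps the iteration stable.
    gamma = 1.0 / np.linalg.norm(Phi, 2) ** 2
    theta_hat = np.zeros(Phi.shape[1])
    for _ in range(steps):
        # Gradient step on the squared error summed over the whole stack.
        theta_hat = theta_hat + gamma * Phi.T @ (Y - Phi @ theta_hat)
    return theta_hat

rng = np.random.default_rng(0)
theta_true = np.array([1.0, -2.0, 0.5])  # hypothetical unknown cost weights
Phi = rng.standard_normal((20, 3))       # stored regressors (illustrative data)
Y = Phi @ theta_true                     # noise-free stored targets
theta_hat = history_stack_estimate(Phi, Y)
```

Because the update uses every stored sample at each step, the estimate can keep improving even when new data are no longer informative, which is the usual motivation for keeping a history stack.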