Inverse Reinforcement Learning in a Continuous State Space with Formal Guarantees
This work addresses a challenging problem in automated control, namely settings where the reward function is hard to specify manually, although the contribution appears incremental because it builds on existing IRL methods.
The authors tackle inverse reinforcement learning in continuous state spaces with unknown transition dynamics by developing a new algorithm that models the system with orthonormal basis functions. They provide formal guarantees on correctness, sample complexity, and time complexity, and support these guarantees with synthetic experiments.
Inverse Reinforcement Learning (IRL) is the problem of finding a reward function that describes observed or otherwise known expert behavior. The IRL setting is remarkably useful for automated control in situations where the reward function is difficult to specify manually, and as a means of extracting an agent's preferences. In this work, we provide a new IRL algorithm for the continuous state space setting with unknown transition dynamics by modeling the system using a basis of orthonormal functions. Moreover, we provide a proof of correctness and formal guarantees on the sample and time complexity of our algorithm. Finally, we present synthetic experiments to corroborate our theoretical guarantees.
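To make the idea of modeling with an orthonormal basis concrete, the following is a minimal sketch and not the algorithm from the paper. The abstract does not say which basis or estimator is used, so the sketch assumes a one-dimensional state space [0, 1], the cosine basis (which is orthonormal on that interval), and a midpoint-rule approximation of the inner product. The names `phi`, `inner`, `project`, `reconstruct`, and `toy_reward` are hypothetical and chosen only for illustration.

```python
import math


def phi(k, x):
    """k-th cosine basis function; the family is orthonormal on [0, 1]."""
    return 1.0 if k == 0 else math.sqrt(2.0) * math.cos(k * math.pi * x)


def inner(f, g, n=2000):
    """Midpoint-rule approximation of the L2 inner product of f and g on [0, 1]."""
    return sum(f((i + 0.5) / n) * g((i + 0.5) / n) for i in range(n)) / n


def project(f, num_basis, n=2000):
    """Coefficients of f with respect to the first num_basis basis functions."""
    # k=k freezes the loop variable inside each lambda.
    return [inner(f, lambda x, k=k: phi(k, x), n) for k in range(num_basis)]


def reconstruct(coeffs, x):
    """Evaluate the truncated basis expansion at x."""
    return sum(c * phi(k, x) for k, c in enumerate(coeffs))


def toy_reward(x):
    """Hypothetical smooth function on [0, 1], standing in for an unknown quantity."""
    return (x - 0.3) ** 2


# Represent the continuous function by a finite coefficient vector.
coeffs = project(toy_reward, num_basis=8)

# Worst-case gap between the function and its 8-term expansion on a grid.
max_err = max(abs(reconstruct(coeffs, j / 100) - toy_reward(j / 100))
              for j in range(101))
```

The point of the sketch is that a function on a continuous domain is replaced by a finite vector of coefficients, so a problem posed over functions becomes a problem over finitely many numbers. How many basis functions are needed, and how the coefficients are estimated from sampled trajectories rather than from the function itself, is what the paper's sample and time complexity guarantees would address.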