Online inverse reinforcement learning for nonlinear systems
Provides a method for learning reward functions in nonlinear systems with unknown dynamics, but the results are validated only in simulation and the contribution is incremental over existing IRL approaches.
Developed an online inverse reinforcement learning method for nonlinear systems that estimates the unknown cost and value functions from observed trajectories; in simulation, the estimates of the reward weights and of the dynamics both converge.
This paper develops an online inverse reinforcement learning (IRL) technique for a class of nonlinear systems. The approach uses observed state and input trajectories to determine the unknown cost function and the unknown value function online. A parameter estimation technique allows the IRL technique to determine the cost function weights in the presence of unknown dynamics. Simulation results for a nonlinear system show convergence of the estimates of both the unknown reward function weights and the unknown dynamics.
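
The general idea of recovering cost and value function weights from observed state and input trajectories can be illustrated with a deliberately simplified sketch. It is not the paper's online update law: it is a batch least-squares stand-in on a hypothetical scalar linear system x_dot = a*x + b*u with cost q*x^2 + r*u^2 and value function p*x^2. Everything here is an assumption made for illustration: the system, the numerical values, the choice to treat b and r as known (fixing r removes the scale ambiguity of the cost), and the availability of noise-free state derivatives. The sketch first estimates the unknown drift parameter a from the trajectory, then solves the Hamilton-Jacobi-Bellman and optimal-policy conditions, which are linear in p and q, for the unknown weights.

```python
import numpy as np

# Hypothetical ground truth, used only to generate the observed demonstration.
a_true, b, q_true, r = 1.0, 1.0, 3.0, 1.0
# Positive root of the scalar Riccati equation 2*a*p - (b**2/r)*p**2 + q = 0.
p_true = r * (a_true + np.sqrt(a_true**2 + b**2 * q_true / r)) / b**2
k = b * p_true / r                      # optimal feedback gain, u = -k*x

# Observed state/input trajectory of the optimal agent (closed form here).
t = np.linspace(0.0, 2.0, 50)
x = 2.0 * np.exp((a_true - b * k) * t)
u = -k * x
x_dot = a_true * x + b * u              # assumed measurable without noise

# Step 1: estimate the unknown drift parameter a (b is assumed known).
a_hat = np.linalg.lstsq(x[:, None], x_dot - b * u, rcond=None)[0][0]

# Step 2: stack two conditions that are linear in the unknowns [p, q]:
#   HJB:    2*p*x*(a_hat*x + b*u) + q*x**2 = -r*u**2
#   policy: b*x*p                          = -r*u
hjb_rows = np.column_stack([2.0 * x * (a_hat * x + b * u), x**2])
policy_rows = np.column_stack([b * x, np.zeros_like(x)])
A = np.vstack([hjb_rows, policy_rows])
rhs = np.concatenate([-r * u**2, -r * u])
p_hat, q_hat = np.linalg.lstsq(A, rhs, rcond=None)[0]

print(f"a_hat={a_hat:.4f}  p_hat={p_hat:.4f}  q_hat={q_hat:.4f}")
```

With exact data, the estimates should match the values used to generate the trajectory up to numerical precision. In this toy setting b is held fixed because, along a single closed-loop trajectory with u = -k*x, the regressors x and u are collinear and a and b cannot be separated. With noisy data or a genuinely nonlinear system, richer basis functions and sufficiently informative data would be needed.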