RO, LG | Aug 12, 2016

Density Matching Reward Learning

Sungjoon Choi, Kyungjae Lee, Andy Park, Songhwai Oh

arXiv:1608.03694v1 11.710 citations

Originality: Incremental advance

AI Analysis

This work addresses the challenge of inferring reward functions from expert demonstrations. It is an incremental advance: it builds on existing IRL methods by adding a density-based approach.

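The authors tackle inverse reinforcement learning (IRL) by proposing DMRL, a model-free density-based algorithm, and extend it to KDMRL for nonlinear rewards. In grid world experiments, KDMRL outperforms the compared model-based and model-free IRL methods on a nonlinear reward setting and is competitive on a linear reward setting, measured by expected value difference. KDMRL is also applied, together with receding horizon control, to learning different driving styles in track driving experiments.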

In this paper, we focus on the problem of inferring the underlying reward function of an expert given demonstrations, which is often referred to as inverse reinforcement learning (IRL). In particular, we propose a model-free density-based IRL algorithm, named density matching reward learning (DMRL), which does not require model dynamics. The performance of DMRL is analyzed theoretically and the sample complexity is derived. Furthermore, the proposed DMRL is extended to handle nonlinear IRL problems by assuming that the reward function is in the reproducing kernel Hilbert space (RKHS) and kernel DMRL (KDMRL) is proposed. The parameters for KDMRL can be computed analytically, which greatly reduces the computation time. The performance of KDMRL is extensively evaluated in two sets of experiments: grid world and track driving experiments. In grid world experiments, the proposed KDMRL method is compared with both model-based and model-free IRL methods and shows superior performance on a nonlinear reward setting and competitive performance on a linear reward setting in terms of expected value differences. Then we move on to more realistic experiments of learning different driving styles for autonomous navigation in complex and dynamic tracks using KDMRL and receding horizon control.
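The abstract does not spell out the objective, so the following is only a minimal sketch of one way a density-based, model-free reward estimate can work, not the authors' implementation. It assumes a small discrete state-action space, estimates the expert's state-action density from visit counts, and picks the unit-norm reward that maximizes the inner product with that density (by the Cauchy-Schwarz inequality, this is the normalized density itself). The function names `empirical_density` and `density_matching_reward`, the unit-norm constraint, and the toy demonstrations are all illustrative assumptions.

```python
import math
from collections import Counter


def empirical_density(demos):
    """Estimate the state-action density as normalized visit counts.

    `demos` is a list of (state, action) pairs; no transition model is used.
    """
    if not demos:
        raise ValueError("need at least one demonstration sample")
    counts = Counter(demos)
    total = sum(counts.values())
    return {sa: c / total for sa, c in counts.items()}


def density_matching_reward(density):
    """Return the unit-L2-norm reward maximizing sum(density * reward).

    Assumed objective (not stated in the abstract): maximize the inner
    product of reward and density subject to ||reward||_2 <= 1, whose
    maximizer is density / ||density||_2.
    """
    norm = math.sqrt(sum(p * p for p in density.values()))
    return {sa: p / norm for sa, p in density.items()}


# Hypothetical toy demonstrations: (state, action) pairs.
demos = [(0, 1), (0, 1), (1, 0), (2, 1)]
mu = empirical_density(demos)
reward = density_matching_reward(mu)
```

Under these assumptions, state-action pairs the expert visits more often receive higher reward, and pairs never visited are implicitly assigned zero. A kernelized variant in the spirit of KDMRL would replace the tabular reward with a function in an RKHS, which this sketch does not attempt.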
