Expressing Diverse Human Driving Behavior with Probabilistic Rewards and Online Inference
This work addresses the challenge of representing varied human driving styles in human-robot interaction systems. Its contribution is incremental, since it builds on existing IRL methods.
The paper tackles the problem of capturing diverse human driving behavior for autonomous vehicles by proposing a probabilistic inverse reinforcement learning (IRL) framework that learns a distribution over cost functions. In evaluations on synthetic and real human driving data, the framework expresses diverse behaviors better and extracts driving styles that match how human participants interpret them.
In human-robot interaction (HRI) systems, such as autonomous vehicles, understanding and representing human behavior are important. Human behavior is naturally rich and diverse. Cost/reward learning, as an efficient way to learn and represent human behavior, has been successfully applied in many domains. Most traditional inverse reinforcement learning (IRL) algorithms, however, cannot adequately capture the diversity of human behavior, since they assume that all behavior in a given dataset is generated by a single cost function. In this paper, we propose a probabilistic IRL framework that directly learns a distribution of cost functions in continuous domains. Evaluations on both synthetic data and real human driving data are conducted. Both the quantitative and subjective results show that our proposed framework can better express diverse human driving behaviors and can extract different driving styles that match what human participants interpret in our user study.
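The abstract does not spell out how the learned distribution of cost functions is used online, so the following is only an illustrative sketch and not the paper's algorithm. It assumes a small discrete set of candidate cost functions (the paper works in continuous domains), linear costs over hand-picked features, and a Boltzmann-rational driver model. The names `action_probabilities`, `update_posterior`, the feature values, and the two "styles" are all hypothetical. The point is to show why a single cost function is limiting: with several candidates, observed behavior shifts probability mass toward the style that explains it best.

```python
import numpy as np


def action_probabilities(features, weights, beta=1.0):
    """Boltzmann-rational model: P(a) is proportional to exp(-beta * cost(a)).

    features: (num_actions, num_features) feature vector of each candidate action
    weights:  (num_features,) weights of one hypothetical linear cost function
    """
    costs = features @ weights
    logits = -beta * costs
    logits = logits - logits.max()  # shift for numerical stability
    probs = np.exp(logits)
    return probs / probs.sum()


def update_posterior(prior, features, candidate_weights, chosen_action, beta=1.0):
    """One Bayes step over a discrete set of candidate cost functions."""
    likelihood = np.array([
        action_probabilities(features, w, beta)[chosen_action]
        for w in candidate_weights
    ])
    posterior = prior * likelihood
    return posterior / posterior.sum()


# Assumed toy setup: three candidate actions described by two features,
# [slowness, closeness to the lead vehicle].
features = np.array([
    [0.0, 1.0],  # action 0: accelerate
    [0.5, 0.5],  # action 1: keep speed
    [1.0, 0.0],  # action 2: brake
])

# Two hypothetical driving styles, each a different cost-weight vector.
candidate_weights = np.array([
    [2.0, 0.2],  # style 0: "aggressive", penalises slowness
    [0.2, 2.0],  # style 1: "conservative", penalises closeness
])

posterior = np.array([0.5, 0.5])  # uniform prior over the two styles
for chosen_action in [2, 2, 2]:   # the driver is observed braking three times
    posterior = update_posterior(posterior, features, candidate_weights, chosen_action)

print(posterior)
```

After the three braking observations, most of the posterior mass sits on the conservative style. A single-cost model fitted to a mixed dataset would instead have to compromise between the two weight vectors.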