Solution Concepts in Hierarchical Games under Bounded Rationality with Applications to Autonomous Driving
This work addresses the need for more realistic game-theoretic models of human driving behavior to improve autonomous vehicle integration, though its contribution is incremental, since it adapts existing bounded rationality concepts to a specific domain.
The paper tackles the problem of modeling human driving as bounded rational behavior in multi-agent motion planning for autonomous vehicles by adapting four bounded rational behavior models and evaluating them on a dataset of human driving at an urban intersection. The results show that a Quantal level-k model with rule-following level-0 behavior provides the best fit at the maneuver level, while bounds sampling and maxmax models are most accurate at the trajectory level, with situational factors significantly affecting performance.
With autonomous vehicles (AVs) set to integrate further into regular human traffic, there is an increasing consensus on treating AV motion planning as a multi-agent problem. However, the traditional game-theoretic assumption of complete rationality is too strong for human driving, and there is a need to understand human driving as a \emph{bounded rational} activity through a behavioural game-theoretic lens. To that end, we adapt four metamodels of bounded rational behaviour: three based on Quantal level-k and one based on Nash equilibrium with quantal errors. We formalize the different solution concepts that can be applied in the context of hierarchical games, a framework used in multi-agent motion planning, for the purpose of creating game-theoretic models of driving behaviour. Furthermore, based on a contributed dataset of human driving at a busy urban intersection with a total of approximately 4k agents and 44k decision points, we evaluate the behaviour models on the basis of model fit to naturalistic data, as well as their predictive capacity. Our results suggest that, among the behaviour models evaluated, at the level of maneuvers, an adaptation of the Quantal level-k model with level-0 behaviour modelled as pure rule-following provides the best fit to naturalistic driving behaviour. At the level of trajectories, bounds sampling of actions together with a maxmax non-strategic model is the most accurate among the models compared. We also find a significant impact of situational factors on the performance of behaviour models.
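To make the Quantal level-k idea with a rule-following level-0 concrete, the following is a minimal sketch and not the paper's implementation. It assumes a simplified variant in which a level-k agent responds only to a level-(k-1) opponent and both agents share one precision parameter. The two-action intersection game, the payoff values, the right-of-way rule, and all function and variable names are hypothetical and chosen purely for illustration.

```python
import numpy as np


def quantal_response(utilities, precision):
    """Logit (softmax) response: P(a) is proportional to exp(precision * U(a))."""
    z = precision * np.asarray(utilities, dtype=float)
    z = z - z.max()  # shift for numerical stability; does not change the result
    p = np.exp(z)
    return p / p.sum()


def quantal_level_k(own_payoffs, other_payoffs, level0_own, level0_other,
                    k, precision):
    """Mixed strategy of a level-k agent (simplified, assumed variant).

    own_payoffs[i, j]: payoff to this agent when it plays i and the other plays j.
    other_payoffs[j, i]: payoff to the other agent when it plays j and this agent plays i.
    level0_own / level0_other: level-0 strategies, here encoding rule-following.
    """
    if k == 0:
        return np.asarray(level0_own, dtype=float)
    # Belief about the opponent: it reasons one level lower (roles swapped).
    belief = quantal_level_k(other_payoffs, own_payoffs,
                             level0_other, level0_own, k - 1, precision)
    expected = np.asarray(own_payoffs, dtype=float) @ belief
    return quantal_response(expected, precision)


# Hypothetical two-vehicle game; actions are 0 = wait, 1 = proceed.
# Proceeding alone yields 1, both proceeding (conflict) yields -10, waiting yields 0.
A_pay = np.array([[0.0, 0.0],
                  [1.0, -10.0]])
B_pay = np.array([[0.0, 0.0],
                  [1.0, -10.0]])

# Assumed rule: vehicle A has right of way, so rule-following level-0
# behaviour is "A proceeds, B waits".
l0_A = np.array([0.0, 1.0])
l0_B = np.array([1.0, 0.0])

p_A1 = quantal_level_k(A_pay, B_pay, l0_A, l0_B, k=1, precision=2.0)
p_B1 = quantal_level_k(B_pay, A_pay, l0_B, l0_A, k=1, precision=2.0)
print("level-1 A (wait, proceed):", p_A1)
print("level-1 B (wait, proceed):", p_B1)
```

In this toy setting, a level-1 vehicle A that expects B to follow the rule mostly proceeds, with some probability of waiting that shrinks as the precision grows, while a level-1 vehicle B that expects A to proceed almost always waits. Fitting such a model to data would amount to estimating the precision (and the distribution over levels) from observed choices, which this sketch does not attempt.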