Inferring Objectives in Continuous Dynamic Games from Noise-Corrupted Partial State Observations
This work addresses the challenge of designing agent objectives in multi-agent systems such as robotics, which supports downstream tasks such as trajectory forecasting. Because it builds on existing dynamic game theory frameworks, the contribution appears incremental.
The paper tackles the problem of inferring agent objectives in continuous dynamic games from noisy, partial state observations. It proposes a method that jointly optimizes objectives and state estimates, coupling them through Nash equilibrium constraints. In simulated traffic scenarios, the method estimates objectives reliably and predicts trajectories accurately.
Robots and autonomous systems must interact with one another and their environment to provide high-quality services to their users. Dynamic game theory provides an expressive theoretical framework for modeling scenarios involving multiple agents with differing objectives interacting over time. A core challenge when formulating a dynamic game is designing objectives for each agent that capture desired behavior. In this paper, we propose a method for inferring parametric objective models of multiple agents based on observed interactions. Our inverse game solver jointly optimizes player objectives and continuous-state estimates by coupling them through Nash equilibrium constraints. Hence, our method is able to directly maximize the observation likelihood rather than other non-probabilistic surrogate criteria. Our method does not require full observations of game states or player strategies to identify player objectives. Instead, it robustly recovers this information from noisy, partial state observations. As a byproduct of estimating player objectives, our method computes a Nash equilibrium trajectory corresponding to those objectives. Thus, it is suitable for downstream trajectory forecasting tasks. We demonstrate our method in several simulated traffic scenarios. Results show that it reliably estimates player objectives from a short sequence of noise-corrupted partial state observations. Furthermore, using the estimated objectives, our method makes accurate predictions of each player's trajectory.
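The joint estimation described above can be sketched as a constrained maximum-likelihood problem. The notation here is assumed for illustration and is not taken from the abstract: \theta collects the objective parameters of all players, x and u are the state and control trajectories, y is the sequence of observations, and \Gamma(\theta) is the dynamic game induced by \theta.

```latex
\begin{aligned}
\max_{\theta,\, x,\, u} \quad & p(y \mid x, u) \\
\text{subject to} \quad & (x, u) \text{ is a Nash equilibrium of } \Gamma(\theta)
\end{aligned}
% Illustrative special case (an assumption, not stated in the abstract):
% for observations y_t = h(x_t) + n_t with i.i.d. zero-mean isotropic Gaussian
% noise n_t, maximizing the likelihood is equivalent to a least-squares fit,
\begin{aligned}
\min_{\theta,\, x,\, u} \quad & \sum_{t=1}^{T} \lVert y_t - h(x_t) \rVert_2^2 \\
\text{subject to} \quad & (x, u) \text{ is a Nash equilibrium of } \Gamma(\theta)
\end{aligned}
```

In this sketch, partial observability enters only through the assumed observation map h, which may expose just some components of the state. Because x and u are decision variables alongside \theta, the optimizer also yields an equilibrium trajectory for the estimated objectives, which is the byproduct used for forecasting.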