Toward Efficient and Robust Behavior Models for Multi-Agent Driving Simulation
This work addresses scalable, realistic simulation for autonomous driving systems and offers an incremental improvement through architectural and training optimizations.
The paper tackles the challenge of building efficient, realistic behavior models for multi-agent driving simulation. It optimizes the control of individual traffic participants with an instance-centric scene representation and a query-centric symmetric context encoder, significantly reducing training and inference times while improving positional accuracy and robustness over agent-centric baselines.
Scalable multi-agent driving simulation requires behavior models that are both realistic and computationally efficient. We address this by optimizing the behavior model that controls individual traffic participants. To improve efficiency, we adopt an instance-centric scene representation, where each traffic participant and map element is modeled in its own local coordinate frame. This design enables efficient, viewpoint-invariant scene encoding and allows static map tokens to be reused across simulation steps. To model interactions, we employ a query-centric symmetric context encoder with relative positional encodings between local frames. We use Adversarial Inverse Reinforcement Learning to learn the behavior model and propose an adaptive reward transformation that automatically balances robustness and realism during training. Experiments demonstrate that our approach scales efficiently with the number of tokens, significantly reducing training and inference times, while outperforming several agent-centric baselines in terms of positional accuracy and robustness.
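To make the idea of relative positional encodings between local frames concrete, the following is a minimal sketch of the underlying geometry only, not the paper's implementation. It assumes each instance's local frame is summarized by a 2D origin and a heading angle; the names `Pose`, `relative_pose`, and `transform` are hypothetical and chosen for illustration. The sketch shows why such an encoding is viewpoint-invariant: the pose of one frame expressed in another does not change when the whole scene is rotated and translated.

```python
import math
from typing import NamedTuple


class Pose(NamedTuple):
    """Origin and heading of one instance's local frame, in global coordinates.

    Hypothetical container for illustration; not taken from the paper.
    """
    x: float
    y: float
    heading: float  # radians


def wrap_angle(angle: float) -> float:
    """Wrap an angle to the interval [-pi, pi)."""
    return (angle + math.pi) % (2.0 * math.pi) - math.pi


def relative_pose(query: Pose, key: Pose) -> Pose:
    """Express the key's local frame in the query's local frame."""
    dx, dy = key.x - query.x, key.y - query.y
    cos_h, sin_h = math.cos(query.heading), math.sin(query.heading)
    # Rotate the global offset by -query.heading into the query frame.
    rel_x = cos_h * dx + sin_h * dy
    rel_y = -sin_h * dx + cos_h * dy
    return Pose(rel_x, rel_y, wrap_angle(key.heading - query.heading))


def transform(pose: Pose, shift_x: float, shift_y: float, rotation: float) -> Pose:
    """Apply a global rigid transform: rotate about the origin, then translate."""
    cos_r, sin_r = math.cos(rotation), math.sin(rotation)
    return Pose(
        cos_r * pose.x - sin_r * pose.y + shift_x,
        sin_r * pose.x + cos_r * pose.y + shift_y,
        wrap_angle(pose.heading + rotation),
    )


if __name__ == "__main__":
    agent = Pose(10.0, 5.0, math.pi / 2)   # traffic participant facing +y
    lane = Pose(10.0, 15.0, math.pi / 2)   # map element 10 m ahead of it
    print(relative_pose(agent, lane))
    # The same pair after moving the whole scene gives the same relative pose,
    # up to floating-point error.
    print(relative_pose(transform(agent, 3.0, -7.0, 1.1),
                        transform(lane, 3.0, -7.0, 1.1)))
```

Because the relative pose depends only on the pair of frames, features computed for a static map element in its own local frame do not depend on where any agent currently is, which is consistent with the abstract's point that static map tokens can be reused across simulation steps.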