Context-Aware Timewise VAEs for Real-Time Vehicle Trajectory Prediction
This work addresses the problem of accurate, multi-modal trajectory prediction for autonomous driving and traffic systems, offering incremental improvements through enhanced context integration.
The paper tackles real-time vehicle trajectory prediction by introducing ContextVAE, a context-aware method that integrates environmental and social features and reports state-of-the-art performance on the nuScenes, Lyft Level 5, and Waymo Open Motion datasets.
Real-time, accurate prediction of human steering behaviors has wide applications, from developing intelligent traffic systems to deploying autonomous driving systems in both real and simulated worlds. In this paper, we present ContextVAE, a context-aware approach for multi-modal vehicle trajectory prediction. Built upon the backbone architecture of a timewise variational autoencoder, ContextVAE's observation encoding employs a dual attention mechanism that accounts for the environmental context and the dynamic agents' states in a unified way. By utilizing features extracted from semantic maps during agent state encoding, our approach takes into account both the social features exhibited by agents in the scene and the physical environment constraints to generate map-compliant and socially aware trajectories. We perform extensive testing on the nuScenes prediction challenge, the Lyft Level 5 dataset, and the Waymo Open Motion Dataset to show the effectiveness of our approach and its state-of-the-art performance. In all tested datasets, ContextVAE models are fast to train and provide high-quality multi-modal predictions in real time. Our code is available at: https://github.com/xupei0610/ContextVAE.
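To make the idea of attending to both neighboring agents and map features more concrete, the following is a minimal sketch of one observation-encoding step, not the ContextVAE implementation itself (see the linked repository for that). It assumes plain scaled dot-product attention with no learned projections, which a trained model would normally have, and it processes a single timestep. The names `attend`, `encode_observation`, `ego_state`, `neighbor_states`, and `map_features` are hypothetical and chosen only for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, keys, values):
    """Scaled dot-product attention for a single query vector.

    Returns the attention-weighted sum of `values` and the weights.
    """
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    pooled = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return pooled, weights


def encode_observation(ego_state, neighbor_states, map_features):
    """Hypothetical encoding of one timestep (an assumption, not the paper's code).

    One attention pass pools the neighboring agents' states (social context),
    a second pools map-derived feature vectors (environmental context).
    Both are concatenated with the ego state.
    """
    social, social_weights = attend(ego_state, neighbor_states, neighbor_states)
    environment, map_weights = attend(ego_state, map_features, map_features)
    return ego_state + social + environment, social_weights, map_weights


# Toy inputs: 2-D feature vectors for the ego vehicle, two neighbors,
# and three identical map patches.
ego = [1.0, 0.0]
neighbors = [[1.0, 0.0], [0.0, 1.0]]
map_patches = [[0.5, 0.5], [0.5, 0.5], [0.5, 0.5]]
context, social_w, map_w = encode_observation(ego, neighbors, map_patches)
```

In this toy setup the neighbor whose features align with the ego state receives the larger social attention weight, while the three identical map patches are weighted equally. In a timewise VAE, a context vector of this kind would be computed at every observed timestep and passed to the recurrent encoder; how ContextVAE combines the two attention branches in detail is not specified in the abstract.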