CV | Mar 20, 2023

Leapfrog Diffusion Model for Stochastic Trajectory Prediction

Weibo Mao, Chenxin Xu, Qi Zhu, Siheng Chen, Yanfeng Wang

Berkeley

arXiv:2303.10895v132.2239 citations | h-index: 46 | Has Code

Originality: Incremental advance

AI Analysis

The work addresses the need for real-time human trajectory prediction in applications such as sports analytics and surveillance. The advance is incremental because it builds on existing diffusion models.

The paper tackles slow inference in diffusion models for stochastic trajectory prediction. It introduces a trainable leapfrog initializer that replaces a large number of denoising steps, giving real-time predictions with improved accuracy and diversity. It reports a 23.7%/21.9% ADE/FDE improvement on NFL and inference speedups of 19.3 to 30.8 times over a standard diffusion model across the NBA, NFL, SDD, and ETH-UCY datasets.
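For readers unfamiliar with the metrics, ADE (average displacement error) is the mean Euclidean distance between predicted and true positions over all future timesteps, and FDE (final displacement error) is that distance at the last timestep. The sketch below is a minimal illustration, not the authors' evaluation code. It assumes the common best-of-K convention for stochastic predictors and assumes that "improvement" means relative error reduction against a baseline. The function names are hypothetical.

```python
import math

def displacement_errors(pred, truth):
    """Return (ADE, FDE) for one predicted trajectory.

    pred and truth are equal-length lists of (x, y) points.
    """
    dists = [math.hypot(px - tx, py - ty)
             for (px, py), (tx, ty) in zip(pred, truth)]
    return sum(dists) / len(dists), dists[-1]

def best_of_k(samples, truth):
    """Assumed convention: take the lowest ADE and the lowest FDE
    over K sampled trajectories, each minimized independently."""
    errors = [displacement_errors(s, truth) for s in samples]
    return min(e[0] for e in errors), min(e[1] for e in errors)

def relative_improvement(baseline_error, new_error):
    """Assumed definition: percentage reduction in error."""
    return 100.0 * (baseline_error - new_error) / baseline_error

# Toy data, invented for illustration only.
truth = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
samples = [
    [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)],
    [(0.0, 0.0), (1.0, 0.0), (2.0, 1.0)],
]
min_ade, min_fde = best_of_k(samples, truth)
```

Under the best-of-K convention, a model is rewarded when at least one of its K samples lands near the true future, which is why sample diversity matters for these scores.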

To model the indeterminacy of human behaviors, stochastic trajectory prediction requires a sophisticated multi-modal distribution of future trajectories. Emerging diffusion models have revealed their tremendous representation capacities in numerous generation tasks, showing potential for stochastic trajectory prediction. However, expensive time consumption prevents diffusion models from real-time prediction, since a large number of denoising steps are required to assure sufficient representation ability. To resolve the dilemma, we present LEapfrog Diffusion model (LED), a novel diffusion-based trajectory prediction model, which provides real-time, precise, and diverse predictions. The core of the proposed LED is to leverage a trainable leapfrog initializer to directly learn an expressive multi-modal distribution of future trajectories, which skips a large number of denoising steps, significantly accelerating inference speed. Moreover, the leapfrog initializer is trained to appropriately allocate correlated samples to provide a diversity of predicted future trajectories, significantly improving prediction performances. Extensive experiments on four real-world datasets, including NBA/NFL/SDD/ETH-UCY, show that LED consistently improves performance and achieves 23.7%/21.9% ADE/FDE improvement on NFL. The proposed LED also speeds up the inference 19.3/30.8/24.3/25.1 times compared to the standard diffusion model on NBA/NFL/SDD/ETH-UCY, satisfying real-time inference needs. Code is available at https://github.com/MediaBrain-SJTU/LED.
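The speedup described in the abstract comes from starting the reverse process partway through instead of from pure noise. The toy sketch below shows only that control flow, and it is not the LED implementation. The denoiser and the initializer are hypothetical placeholders, the mean/scale/offset form of the initializer output is an assumption, and the step counts and sample count are illustrative values not taken from the source.

```python
import random

def denoise_step(sample, step):
    """Hypothetical placeholder for one learned denoising step.
    A real model would run a neural network here."""
    return [0.9 * v for v in sample]

def standard_sampling(dim, total_steps, k, rng):
    """Baseline: each of K samples starts from Gaussian noise and
    runs through every denoising step."""
    calls, out = 0, []
    for _ in range(k):
        sample = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        for step in range(total_steps, 0, -1):
            sample = denoise_step(sample, step)
            calls += 1
        out.append(sample)
    return out, calls

def leapfrog_initializer(dim, k, rng):
    """Hypothetical stand-in for the trainable initializer. A trained
    model would emit K correlated offsets jointly; independent noise
    is used here only to keep the sketch runnable."""
    mean = [0.0] * dim
    scale = 1.0
    offsets = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(k)]
    return mean, scale, offsets

def leapfrog_sampling(dim, remaining_steps, k, rng):
    """Leapfrog: build K partially denoised samples directly, then run
    only the few remaining denoising steps."""
    mean, scale, offsets = leapfrog_initializer(dim, k, rng)
    calls, out = 0, []
    for offset in offsets:
        sample = [m + scale * o for m, o in zip(mean, offset)]
        for step in range(remaining_steps, 0, -1):
            sample = denoise_step(sample, step)
            calls += 1
        out.append(sample)
    return out, calls

rng = random.Random(0)
# Illustrative settings: 100 total steps versus 5 remaining steps, 20 samples.
full_samples, full_calls = standard_sampling(4, 100, 20, rng)
fast_samples, fast_calls = leapfrog_sampling(4, 5, 20, rng)
```

In this toy setup the baseline makes 2000 denoiser calls and the leapfrog variant makes 100. Measured wall-clock speedups, such as those reported in the abstract, also depend on the cost of the initializer itself and on the hardware, so the call-count ratio should not be read as a predicted speedup.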
