CV, RO, Dec 9, 2024

PPT: Pretraining with Pseudo-Labeled Trajectories for Motion Forecasting

arXiv:2412.06491v2, 3 citations, h-index: 29
Originality: Incremental advance
AI Analysis

This paper addresses the scalability and generalization issues in motion forecasting for autonomous driving, though it appears to be an incremental improvement over existing pretraining approaches.

The paper tackles the problem of costly and limited manually annotated datasets for motion forecasting in autonomous driving by introducing PPT, a pretraining method that uses automatically generated pseudo-labeled trajectories. Models pretrained with PPT achieve strong performance across standard benchmarks, particularly in low-data regimes and cross-domain settings.

Accurately predicting how agents move in dynamic scenes is essential for safe autonomous driving. State-of-the-art motion forecasting models rely on large curated datasets with manually annotated or heavily post-processed trajectories. However, building these datasets is costly, largely manual, hard to scale, and hard to reproduce. The resulting datasets also introduce domain gaps that limit generalization across environments. We introduce PPT (Pretraining with Pseudo-labeled Trajectories), a simple and scalable alternative that uses unprocessed and diverse trajectories automatically generated by off-the-shelf 3D detectors and trackers. Unlike traditional pipelines that aim for clean, single-label annotations, PPT embraces noise and diversity as useful signals for learning robust representations. With optional finetuning on a small amount of labeled data, models pretrained with PPT achieve strong performance across standard benchmarks, particularly in low-data regimes and in cross-domain, end-to-end, and multi-class settings. PPT is easy to implement and improves generalization in motion forecasting. Code and data will be released upon acceptance.
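
The abstract does not spell out how tracker output becomes training data, so the following is only a minimal sketch of one plausible way to turn tracked detections into pseudo-labeled (history, future) pairs. The detection tuple format, the function names `build_tracks` and `make_samples`, and the choice to skip windows with tracking gaps are all assumptions made for illustration, not the authors' pipeline.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Assumed tracker output format: (frame_index, track_id, x, y).
Detection = Tuple[int, int, float, float]
Point = Tuple[float, float]


def build_tracks(detections: List[Detection]) -> Dict[int, Dict[int, Point]]:
    """Group tracker outputs by track id, keyed by frame index."""
    tracks: Dict[int, Dict[int, Point]] = defaultdict(dict)
    for frame, track_id, x, y in detections:
        tracks[track_id][frame] = (x, y)
    return dict(tracks)


def make_samples(tracks: Dict[int, Dict[int, Point]],
                 history_len: int,
                 future_len: int):
    """Slice each track into (track_id, history, future) pseudo-labels.

    Hypothetical choice: windows containing a missing frame are skipped.
    A real pipeline might keep or interpolate them instead.
    """
    samples = []
    window = history_len + future_len
    for track_id, frames in tracks.items():
        if not frames:
            continue
        first, last = min(frames), max(frames)
        # Last valid start is the one whose window ends exactly at `last`.
        for start in range(first, last - window + 2):
            idx = range(start, start + window)
            if all(f in frames for f in idx):
                pts = [frames[f] for f in idx]
                samples.append((track_id, pts[:history_len], pts[history_len:]))
    return samples


if __name__ == "__main__":
    # Track 1 is observed on frames 0..4; track 2 has a gap at frame 2.
    dets = [(f, 1, float(f), 0.0) for f in range(5)]
    dets += [(0, 2, 0.0, 1.0), (1, 2, 1.0, 1.0), (3, 2, 3.0, 1.0)]
    for sample in make_samples(build_tracks(dets), history_len=2, future_len=1):
        print(sample)
```

In this sketch the future segment of each window serves as the pseudo-label for its history segment, so no manual annotation is involved. Any noise in the detector or tracker carries straight into the labels, which is the property the abstract says PPT treats as a useful signal.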
