CV, AI, LG, RO, SY | Apr 17, 2022

ParkPredict+: Multimodal Intent and Motion Prediction for Vehicles in Parking Lots with CNN and Transformer

arXiv:2204.10777v2 | 25 citations | h-index: 75
Originality: Incremental advance
AI Analysis

This paper addresses the problem of predicting vehicle behavior in complex parking environments for autonomous driving systems. It is an incremental advance that also contributes a new dataset.

The paper tackles multimodal intent and trajectory prediction for human-driven vehicles in parking lots using CNN and Transformer networks, achieving improved accuracy over existing models while handling arbitrary modes, multi-agent scenarios, and different parking maps. It also introduces the first public 4K video dataset of parking lot driving with accurate annotations, high frame rate, and rich traffic scenarios.

The problem of multimodal intent and trajectory prediction for human-driven vehicles in parking lots is addressed in this paper. Using models designed with CNN and Transformer networks, we extract temporal-spatial and contextual information from trajectory history and local bird's eye view (BEV) semantic images, and generate predictions about intent distribution and future trajectory sequences. Our methods outperform existing models in accuracy, while allowing an arbitrary number of modes, encoding complex multi-agent scenarios, and adapting to different parking maps. To train and evaluate our method, we present the first public 4K video dataset of human driving in parking lots with accurate annotation, high frame rate, and rich traffic scenarios.
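To make the idea of an "intent distribution" with an arbitrary number of modes concrete, here is a minimal sketch in plain Python. It is not the paper's model: the CNN and Transformer components are replaced by hand-written scores, and every name (`softmax`, `top_k_modes`, the intent labels, the waypoints) is a hypothetical chosen for illustration. It only shows how raw per-intent scores could be normalized into probabilities and how the k most likely intents could be paired with their candidate trajectories.

```python
import math
from typing import Dict, List, Tuple

Point = Tuple[float, float]  # (x, y) position in metres; an assumed convention


def softmax(scores: List[float]) -> List[float]:
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_modes(
    intent_scores: Dict[str, float],
    trajectories: Dict[str, List[Point]],
    k: int,
) -> List[Tuple[str, float, List[Point]]]:
    """Return the k most probable intents with their candidate trajectories.

    k may be any positive integer; if it exceeds the number of intents,
    every intent is returned.
    """
    names = list(intent_scores.keys())
    probs = softmax([intent_scores[n] for n in names])
    ranked = sorted(zip(names, probs), key=lambda pair: pair[1], reverse=True)
    return [(name, prob, trajectories[name]) for name, prob in ranked[:k]]


# Hypothetical stand-ins for what a learned model would output.
example_scores = {"spot_12": 2.0, "spot_13": 1.0, "exit": 0.0}
example_trajectories = {
    "spot_12": [(0.0, 0.0), (1.0, 0.5), (2.0, 1.5)],
    "spot_13": [(0.0, 0.0), (1.0, 0.0), (2.5, 0.5)],
    "exit": [(0.0, 0.0), (1.5, 0.0), (3.0, 0.0)],
}

modes = top_k_modes(example_scores, example_trajectories, k=2)
for name, prob, path in modes:
    print(f"{name}: p={prob:.3f}, final waypoint={path[-1]}")
```

Because `k` is a plain argument, the same ranking step works for any number of modes, which is the property the abstract highlights. How the actual model produces its scores and trajectories is not specified here.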

Code Implementations: 1 repo
