End-to-End 3D Multi-Object Tracking and Trajectory Forecasting
This work addresses the need for integrated tracking and forecasting in autonomous driving. Its contribution is incremental: it builds on two existing tasks and adds two novel computational units.
The paper unifies 3D multi-object tracking and trajectory forecasting within a single perception framework. It uses Graph Neural Networks to model agent interaction and a diversity sampling function to generate trajectories, and it reports state-of-the-art performance on the KITTI dataset.
3D multi-object tracking (MOT) and trajectory forecasting are two critical components in modern 3D perception systems. We hypothesize that it is beneficial to unify both tasks under one framework to learn a shared feature representation of agent interaction. To evaluate this hypothesis, we propose a unified solution for 3D MOT and trajectory forecasting which also incorporates two additional novel computational units. First, we employ a feature interaction technique by introducing Graph Neural Networks (GNNs) to capture the way in which multiple agents interact with one another. The GNN is able to model complex hierarchical interactions, improve the discriminative feature learning for MOT association, and provide socially-aware context for trajectory forecasting. Second, we use a diversity sampling function to improve the quality and diversity of our forecasted trajectories. The learned sampling function is trained to efficiently extract a variety of outcomes from a generative trajectory distribution and helps avoid the problem of generating many duplicate trajectory samples. We show that our method achieves state-of-the-art performance on the KITTI dataset. Our project website is at http://www.xinshuoweng.com/projects/GNNTrkForecast.
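The feature interaction idea described above can be illustrated with a minimal sketch of one message-passing round over a fully connected agent graph. This is not the paper's network. It assumes a non-learned mean aggregation with a fixed 0.5/0.5 mixing weight in place of the learned GNN layers, and the name `message_passing_round` and the toy two-dimensional features are hypothetical.

```python
def message_passing_round(features):
    """One round of mean-aggregation message passing (illustrative only).

    features: list of per-agent feature vectors (lists of floats, equal length).
    Every agent is treated as connected to every other agent. Each agent's
    new feature is the average of its own feature and the mean feature of
    all other agents. A real GNN would use learned weights here; the fixed
    0.5/0.5 mix is an assumption made for this sketch.
    """
    n = len(features)
    if n == 0:
        return []
    dim = len(features[0])
    updated = []
    for i in range(n):
        neighbors = [features[j] for j in range(n) if j != i]
        if not neighbors:
            # A lone agent has nobody to interact with; keep its feature.
            updated.append(list(features[i]))
            continue
        # Aggregate the messages from all other agents by averaging.
        mean_msg = [sum(f[d] for f in neighbors) / len(neighbors)
                    for d in range(dim)]
        # Mix the agent's own feature with the aggregated message.
        updated.append([0.5 * features[i][d] + 0.5 * mean_msg[d]
                        for d in range(dim)])
    return updated


# Toy example: three agents with hand-picked 2-D features.
agents = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
updated = message_passing_round(agents)
```

After one round, each agent's feature carries information about the other agents in the scene. In a unified framework such interaction-aware features could feed both the association step in tracking and the forecasting step.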