MultiXNet: Multiclass Multistage Multimodal Motion Prediction
This work addresses the problem of accurate motion prediction for self-driving vehicles. Its contribution is incremental: it builds on prior work by adding features such as multimodal outputs and trajectory refinement.
The paper tackles motion prediction for self-driving vehicles by proposing MultiXNet, an end-to-end approach that uses lidar data directly to detect multiple classes of traffic actors and predict their future motion, producing multimodal distributions with a trajectory refinement stage. The results indicate that it outperforms existing state-of-the-art methods on large-scale real-world data.
One of the critical pieces of the self-driving puzzle is understanding the surroundings of a self-driving vehicle (SDV) and predicting how these surroundings will change in the near future. To address this task we propose MultiXNet, an end-to-end approach for detection and motion prediction based directly on lidar sensor data. This approach builds on prior work by handling multiple classes of traffic actors, adding a jointly trained second-stage trajectory refinement step, and producing a multimodal probability distribution over future actor motion that includes both multiple discrete traffic behaviors and calibrated continuous position uncertainties. The method was evaluated on large-scale, real-world data collected by a fleet of SDVs in several cities, with the results indicating that it outperforms existing state-of-the-art approaches.
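To make the shape of such an output concrete, the following minimal sketch represents a multimodal prediction as a set of discrete modes, each with a probability, a waypoint sequence, and a per-waypoint position uncertainty. The names `TrajectoryMode`, `mode_log_likelihood`, `mixture_nll`, and `most_likely_mode` are hypothetical, and the isotropic Gaussian uncertainty is an assumption made here for simplicity; the abstract does not specify the distribution family or the loss used by MultiXNet.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float]


@dataclass
class TrajectoryMode:
    """One discrete behavior hypothesis (hypothetical container, not from the paper)."""
    probability: float      # discrete mode probability; modes should sum to 1
    waypoints: List[Point]  # predicted (x, y) position at each future time step
    sigmas: List[float]     # assumed isotropic standard deviation per waypoint


def mode_log_likelihood(mode: TrajectoryMode, truth: List[Point]) -> float:
    """Log-density of an observed trajectory under one mode.

    Assumes independent 2D isotropic Gaussians per waypoint.
    """
    total = 0.0
    for (px, py), (tx, ty), sigma in zip(mode.waypoints, truth, mode.sigmas):
        squared_error = (tx - px) ** 2 + (ty - py) ** 2
        variance = sigma * sigma
        total += -math.log(2.0 * math.pi * variance) - squared_error / (2.0 * variance)
    return total


def mixture_nll(modes: List[TrajectoryMode], truth: List[Point]) -> float:
    """Negative log-likelihood of the observed trajectory under the whole mixture."""
    terms = [
        math.log(m.probability) + mode_log_likelihood(m, truth)
        for m in modes
        if m.probability > 0.0
    ]
    peak = max(terms)  # log-sum-exp trick for numerical stability
    return -(peak + math.log(sum(math.exp(t - peak) for t in terms)))


def most_likely_mode(modes: List[TrajectoryMode]) -> TrajectoryMode:
    """Return the mode with the highest discrete probability."""
    return max(modes, key=lambda m: m.probability)


# Illustrative values only: two hypotheses for one actor over two time steps.
go_straight = TrajectoryMode(0.7, [(1.0, 0.0), (2.0, 0.0)], [0.5, 1.0])
turn_left = TrajectoryMode(0.3, [(1.0, 0.5), (2.0, 1.5)], [0.5, 1.0])
prediction = [go_straight, turn_left]
```

In this sketch, a lower `mixture_nll` means the observed trajectory is better explained by the predicted distribution. Calibration in this setting would mean that the predicted `sigmas` match the spread of the actual position errors, which is a property to check on held-out data rather than something the code enforces.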