RMP: A Random Mask Pretrain Framework for Motion Prediction
This work addresses motion prediction for autonomous driving systems, offering an incremental improvement by adapting existing random mask techniques from NLP and CV to this domain.
The paper tackles the lack of pretraining methods for motion prediction in autonomous driving by proposing a random mask pretraining framework that masks object positions at random timesteps and fills them in using a neural network. The framework improves motion prediction accuracy and reduces miss rates, particularly for occluded objects, as demonstrated on Argoverse and NuScenes datasets.
Although pretraining is growing in popularity, little work has been done on pretraining for learning-based motion prediction in autonomous driving. In this paper, we propose a framework to formalize the pretraining task for trajectory prediction of traffic participants. Within our framework, inspired by random masked models in natural language processing (NLP) and computer vision (CV), objects' positions at random timesteps are masked and then filled in by a learned neural network (NN). By changing the mask profile, our framework can easily switch among a range of motion-related tasks. Evaluating on the Argoverse and NuScenes datasets, we show that our proposed pretraining framework is able to deal with noisy inputs and improves motion prediction accuracy and miss rate, especially for objects occluded over time.
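The masking idea described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the array layout `(objects, timesteps, xy)`, the zero placeholder for masked positions, the `mask_ratio` parameter, and the function names `random_mask`, `prediction_mask`, and `masked_l2_loss` are all assumptions made for illustration. The sketch shows a random mask profile for pretraining, a second profile that hides only future timesteps (one plausible way a changed mask profile yields a prediction task), and a reconstruction error computed only on the masked positions.

```python
import numpy as np


def random_mask(traj, mask_ratio=0.5, rng=None):
    """Hide positions at random (object, timestep) pairs.

    traj: float array of shape (N, T, 2), assumed layout (objects, timesteps, xy).
    Returns the masked copy and a boolean mask of shape (N, T); True = hidden.
    """
    rng = np.random.default_rng() if rng is None else rng
    n, t, _ = traj.shape
    mask = rng.random((n, t)) < mask_ratio
    masked = traj.copy()
    masked[mask] = 0.0  # assumption: zeros stand in for hidden positions
    return masked, mask


def prediction_mask(n, t, t_obs):
    """Mask profile that hides every timestep from t_obs onward (hypothetical helper)."""
    mask = np.zeros((n, t), dtype=bool)
    mask[:, t_obs:] = True
    return mask


def masked_l2_loss(pred, target, mask):
    """Mean Euclidean error over the masked positions only."""
    if not mask.any():
        return 0.0
    err = np.linalg.norm(pred - target, axis=-1)  # shape (N, T)
    return float(err[mask].mean())


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    traj = rng.normal(size=(3, 10, 2))  # synthetic trajectories, not real data
    masked, mask = random_mask(traj, mask_ratio=0.4, rng=rng)
    # A real model would take `masked` (and `mask`) and predict the hidden values;
    # here the masked input itself serves as a trivial stand-in prediction.
    print("hidden positions:", int(mask.sum()))
    print("baseline loss:", masked_l2_loss(masked, traj, mask))
```

In this sketch, switching tasks only requires swapping the mask passed to the loss and used to build the input, for example `prediction_mask(n, t, t_obs)` in place of the random mask; the model and loss code stay unchanged.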