Robust Human Trajectory Prediction via Self-Supervised Skeleton Representation Learning
This work matters for autonomous navigation and video surveillance systems that rely on human trajectory prediction, especially in environments where occlusions leave skeleton data incomplete.
This paper addresses human trajectory prediction in real-world scenarios where skeleton data often suffer from missing joints due to occlusions. The authors propose a method built on a self-supervised skeleton representation model pretrained with masked autoencoding. The method is more robust to missing skeletal data and consistently outperforms baseline models in clean-to-moderate missingness regimes.
Human trajectory prediction plays a crucial role in applications such as autonomous navigation and video surveillance. While recent works have explored the integration of human skeleton sequences to complement trajectory information, skeleton data in real-world environments often suffer from missing joints caused by occlusions. These disturbances significantly degrade prediction accuracy, indicating the need for more robust skeleton representations. We propose a robust trajectory prediction method that incorporates a self-supervised skeleton representation model pretrained with masked autoencoding. Experimental results in occlusion-prone scenarios show that our method improves robustness to missing skeletal data without sacrificing prediction accuracy, and consistently outperforms baseline models in clean-to-moderate missingness regimes.
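To make the idea of masked autoencoding on skeletons concrete, the following is a minimal sketch of the general technique, not the authors' implementation. It assumes a skeleton sequence stored as a list of frames, each holding 2D joint coordinates, hides a random subset of joints, and scores a reconstruction only on the hidden joints. The function names `mask_joints` and `masked_reconstruction_loss`, the zero-fill for hidden joints, and the 50% mask ratio are all illustrative assumptions; the source does not specify the masking strategy, the loss, or the encoder architecture.

```python
import random


def mask_joints(skeleton, mask_ratio, rng):
    """Hide each joint independently with probability mask_ratio.

    skeleton: list of frames; each frame is a list of (x, y) joint tuples.
    Returns (masked_skeleton, mask), where mask[t][j] is True if joint j
    in frame t was hidden. Hidden joints are zero-filled (an assumption).
    """
    masked, mask = [], []
    for frame in skeleton:
        frame_mask = [rng.random() < mask_ratio for _ in frame]
        masked.append([
            (0.0, 0.0) if hidden else joint
            for joint, hidden in zip(frame, frame_mask)
        ])
        mask.append(frame_mask)
    return masked, mask


def masked_reconstruction_loss(pred, target, mask):
    """Mean squared error computed over hidden joints only."""
    sq_err, count = 0.0, 0
    for p_frame, t_frame, m_frame in zip(pred, target, mask):
        for p, t, hidden in zip(p_frame, t_frame, m_frame):
            if hidden:
                sq_err += sum((a - b) ** 2 for a, b in zip(p, t))
                count += len(t)
    return sq_err / count if count else 0.0


# Hypothetical toy input: 8 frames, 17 joints, every joint at (1.0, 1.0).
rng = random.Random(0)
skeleton = [[(1.0, 1.0)] * 17 for _ in range(8)]
masked, mask = mask_joints(skeleton, mask_ratio=0.5, rng=rng)

# A real model would predict the hidden joints from the visible ones;
# here the zero-filled input stands in for an untrained prediction.
untrained_loss = masked_reconstruction_loss(masked, skeleton, mask)
perfect_loss = masked_reconstruction_loss(skeleton, skeleton, mask)
```

In a pretraining setup of this kind, an encoder-decoder would be trained to drive this loss down, so the encoder learns to infer hidden joints from visible ones. That is the property that plausibly carries over to occluded skeletons at prediction time. A practical version would use array operations rather than nested lists.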