CV · AI · Sep 2, 2021

TrouSPI-Net: Spatio-temporal attention on parallel atrous convolutions and U-GRUs for skeletal pedestrian crossing prediction

Joseph Gesnouin, Steve Pechberti, Bogdan Stanciulescu, Fabien Moutarde

arXiv:2109.00953v28.740 citations

Originality: Incremental advance

AI Analysis

This paper addresses pedestrian safety for autonomous vehicles, offering incremental improvements in crossing-prediction accuracy.

The paper tackles pedestrian crossing prediction by linking skeletal dynamics to crossing intention, achieving F1 scores of 0.76 on JAAD and 0.80 on PIE, outperforming state-of-the-art methods.

Understanding the behaviors and intentions of pedestrians is still one of the main challenges for vehicle autonomy, as accurate predictions of their intentions can guarantee their safety and the driving comfort of vehicles. In this paper, we address pedestrian crossing prediction in urban traffic environments by linking the dynamics of a pedestrian's skeleton to a binary crossing intention. We introduce TrouSPI-Net: a context-free, lightweight, multi-branch predictor. TrouSPI-Net extracts spatio-temporal features at different time resolutions by encoding sequences of skeletal joint positions as pseudo-images and processing them with parallel attention modules and atrous convolutions. The proposed approach is then enhanced by processing features such as relative distances of skeletal joints, bounding box positions, or ego-vehicle speed with U-GRUs. Using the newly proposed evaluation procedures for two large public naturalistic data sets for studying pedestrian behavior in traffic, JAAD and PIE, we evaluate TrouSPI-Net and analyze its performance. Experimental results show that TrouSPI-Net achieves an F1 score of 0.76 on JAAD and 0.80 on PIE, thereby outperforming the current state of the art while being lightweight and context-free.
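To make the idea of "parallel atrous convolutions over different time resolutions" concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the function names (`to_pseudo_image`, `joint_track`, `atrous_conv_time`, `parallel_branches`), the kernel, and the dilation rates (1, 2, 4) are all assumptions chosen for illustration. It arranges a skeleton sequence as a time x joints x coordinates grid and applies the same temporal filter with several dilation rates, so each branch sees a different temporal span.

```python
from typing import Dict, List, Sequence

def to_pseudo_image(frames: Sequence[Sequence[Sequence[float]]]) -> List[List[List[float]]]:
    """Arrange T frames of J (x, y) joints as a T x J x 2 grid:
    rows = time, columns = joints, channels = coordinates (assumed layout)."""
    return [[list(joint) for joint in frame] for frame in frames]

def joint_track(image: List[List[List[float]]], joint: int, coord: int) -> List[float]:
    """Read one coordinate of one joint across all time steps."""
    return [row[joint][coord] for row in image]

def atrous_conv_time(signal: Sequence[float], kernel: Sequence[float], dilation: int) -> List[float]:
    """'Valid' 1D dilated (atrous) cross-correlation along the time axis.
    Kernel taps are spaced `dilation` steps apart, widening the receptive field."""
    span = (len(kernel) - 1) * dilation
    return [
        sum(k * signal[t + i * dilation] for i, k in enumerate(kernel))
        for t in range(len(signal) - span)
    ]

def parallel_branches(signal: Sequence[float], kernel: Sequence[float],
                      dilations: Sequence[int] = (1, 2, 4)) -> Dict[int, List[float]]:
    """Run the same kernel at several dilation rates (hypothetical rates),
    one branch per time resolution."""
    return {d: atrous_conv_time(signal, kernel, d) for d in dilations}

# Toy sequence: 9 frames, 2 joints, joint 0 moves one unit in x per frame.
frames = [[[float(t), 0.0], [0.0, 0.0]] for t in range(9)]
image = to_pseudo_image(frames)
x_track = joint_track(image, joint=0, coord=0)

# A difference kernel measures displacement over 2 * dilation frames.
branches = parallel_branches(x_track, kernel=[-1.0, 0.0, 1.0])
```

With the difference kernel above, each branch measures the joint's displacement over a window of 2 x dilation frames, which is one simple way to see how larger dilation rates capture slower dynamics without adding parameters. In a real model the kernels would be learned and the branch outputs combined, for example through the attention modules the abstract mentions.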
