CVAISep 2, 2021

TrouSPI-Net: Spatio-temporal attention on parallel atrous convolutions and U-GRUs for skeletal pedestrian crossing prediction

arXiv:2109.00953v240 citations
Originality Incremental advance
AI Analysis

This addresses pedestrian safety for autonomous vehicles, with incremental improvements in prediction accuracy.

The paper tackles pedestrian crossing prediction by linking skeletal dynamics to crossing intention, achieving F1 scores of 0.76 on JAAD and 0.80 on PIE, outperforming state-of-the-art methods.

Understanding the behaviors and intentions of pedestrians is still one of the main challenges for vehicle autonomy, as accurate predictions of their intentions can guarantee their safety and driving comfort of vehicles. In this paper, we address pedestrian crossing prediction in urban traffic environments by linking the dynamics of a pedestrian's skeleton to a binary crossing intention. We introduce TrouSPI-Net: a context-free, lightweight, multi-branch predictor. TrouSPI-Net extracts spatio-temporal features for different time resolutions by encoding pseudo-images sequences of skeletal joints' positions and processes them with parallel attention modules and atrous convolutions. The proposed approach is then enhanced by processing features such as relative distances of skeletal joints, bounding box positions, or ego-vehicle speed with U-GRUs. Using the newly proposed evaluation procedures for two large public naturalistic data sets for studying pedestrian behavior in traffic: JAAD and PIE, we evaluate TrouSPI-Net and analyze its performance. Experimental results show that TrouSPI-Net achieved 0.76 F1 score on JAAD and 0.80 F1 score on PIE, therefore outperforming current state-of-the-art while being lightweight and context-free.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes