CVLGAug 27, 2025

TrajFusionNet: Pedestrian Crossing Intention Prediction via Fusion of Sequential and Visual Trajectory Representations

arXiv:2508.19866v1h-index: 38
Originality Incremental advance
AI Analysis

This work addresses a critical safety problem for autonomous vehicles by improving pedestrian intention prediction, though it appears incremental as it builds on existing transformer and attention methods.

The paper tackles pedestrian crossing intention prediction for autonomous vehicles by proposing TrajFusionNet, a transformer-based model that fuses sequential and visual trajectory representations, achieving state-of-the-art results on three common datasets and the lowest total inference time.

With the introduction of vehicles with autonomous capabilities on public roads, predicting pedestrian crossing intention has emerged as an active area of research. The task of predicting pedestrian crossing intention involves determining whether pedestrians in the scene are likely to cross the road or not. In this work, we propose TrajFusionNet, a novel transformer-based model that combines future pedestrian trajectory and vehicle speed predictions as priors for predicting crossing intention. TrajFusionNet comprises two branches: a Sequence Attention Module (SAM) and a Visual Attention Module (VAM). The SAM branch learns from a sequential representation of the observed and predicted pedestrian trajectory and vehicle speed. Complementarily, the VAM branch enables learning from a visual representation of the predicted pedestrian trajectory by overlaying predicted pedestrian bounding boxes onto scene images. By utilizing a small number of lightweight modalities, TrajFusionNet achieves the lowest total inference time (including model runtime and data preprocessing) among current state-of-the-art approaches. In terms of performance, it achieves state-of-the-art results across the three most commonly used datasets for pedestrian crossing intention prediction.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes