CV, AI | May 18, 2021

IntFormer: Predicting pedestrian intention with the aid of the Transformer architecture

arXiv:2105.08647v1 | 40 citations
Originality: Incremental advance
AI Analysis

This work addresses pedestrian safety and traffic flow for intelligent vehicles, but it is incremental as it builds on existing transformer and convolutional methods.

The paper tackles pedestrian crossing intention prediction with IntFormer, a Transformer-based model combined with RubiksNet. It reports state-of-the-art results at a throughput of approximately 40 sequences per second while being 8 times smaller than the best-performing model.

Understanding pedestrian crossing behavior is an essential goal in intelligent vehicle development, leading to an improvement in their security and traffic flow. In this paper, we developed a method called IntFormer. It is based on the Transformer architecture and a novel convolutional video classification model called RubiksNet. Following the evaluation procedure in a recent benchmark, we show that our model reaches state-of-the-art results with good performance ($\approx 40$ seq. per second) and size ($8\times$ smaller than the best performing model), making it suitable for real-time usage. We also explore each of the input features, finding that ego-vehicle speed is the most important variable, possibly due to the similarity in crossing cases in the PIE dataset.

