CV | Mar 21, 2025

Salient Object Detection in Traffic Scene through the TSOD10K Dataset

Yu Qiu, Yuhang Sun, Jie Mei, Lin Xiao, Jing Xu

arXiv:2503.16910v18.43 citations | h-index: 11 | IEEE Transactions on Image Processing

Originality: Incremental advance

AI Analysis

The work establishes what its authors describe as the first foundation for safety-aware saliency analysis in intelligent transportation systems, addressing a domain-specific need in autonomous and assisted driving.

The paper tackles the detection of safety-critical objects in traffic scenes. It introduces TSOD10K, the first large-scale Traffic Salient Object Detection (TSOD) dataset, and proposes Tramba, a Mamba-based model that outperforms 22 existing models on this benchmark.

Traffic Salient Object Detection (TSOD) aims to segment the objects critical to driving safety by combining semantic saliency (e.g., collision risks) with visual saliency. Unlike SOD in natural scene images (NSI-SOD), which prioritizes visually distinctive regions, TSOD emphasizes objects that demand immediate driver attention because of their semantic impact, even when their visual contrast is low. This dual criterion, i.e., bridging perception and contextual risk, redefines saliency for autonomous and assisted driving systems. To address the lack of task-specific benchmarks, we collect the first large-scale TSOD dataset with pixel-wise saliency annotations, named TSOD10K. TSOD10K covers diverse object categories in real-world traffic scenes under challenging weather and illumination conditions (e.g., fog, snowstorms, low contrast, and low light). Methodologically, we propose a Mamba-based TSOD model, termed Tramba. To address the challenge of distinguishing inconspicuous visual information from complex traffic backgrounds, Tramba introduces a novel Dual-Frequency Visual State Space module equipped with shifted window partitioning and dilated scanning, which enhances the perception of fine details and global structure by hierarchically decomposing high- and low-frequency components. To emphasize critical regions in traffic scenes, we propose a traffic-oriented Helix 2D-Selective-Scan (Helix-SS2D) mechanism that injects driving attention priors while effectively capturing global multi-direction spatial dependencies. We establish a comprehensive benchmark by evaluating Tramba and 22 existing NSI-SOD models on TSOD10K, demonstrating Tramba's superiority. Our research establishes the first foundation for safety-aware saliency analysis in intelligent transportation systems.
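The abstract does not say how the Dual-Frequency Visual State Space module separates high- and low-frequency components. As a rough illustration of the general idea only, the sketch below splits a 2D grid into a low-frequency part (a box blur) and a high-frequency residual. The box filter, the function names `box_blur` and `split_frequencies`, and the `radius` parameter are all assumptions made for this example and are not taken from Tramba.

```python
def box_blur(img, radius=1):
    """Low-pass filter (assumed for illustration): mean over a
    (2*radius+1) x (2*radius+1) window, truncated at the image borders."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total, count = 0.0, 0
            for yy in range(max(0, y - radius), min(h, y + radius + 1)):
                for xx in range(max(0, x - radius), min(w, x + radius + 1)):
                    total += img[yy][xx]
                    count += 1
            out[y][x] = total / count
    return out


def split_frequencies(img, radius=1):
    """Return (low, high) where low is the blurred image and
    high = img - low holds the fine detail removed by the blur."""
    low = box_blur(img, radius)
    high = [
        [img[y][x] - low[y][x] for x in range(len(img[0]))]
        for y in range(len(img))
    ]
    return low, high


# Hypothetical 4x4 "image": a flat background with one bright pixel.
image = [
    [1.0, 1.0, 1.0, 1.0],
    [1.0, 9.0, 1.0, 1.0],
    [1.0, 1.0, 1.0, 1.0],
    [1.0, 1.0, 1.0, 1.0],
]
low, high = split_frequencies(image, radius=1)
```

In this toy split, the bright pixel shows up mainly in `high` (fine detail), while `low` keeps the smooth overall structure; adding the two parts back together recovers the input up to floating-point rounding.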
