RO CV · Nov 26, 2024

On-Road Object Importance Estimation: A New Dataset and A Model with Multi-Fold Top-Down Guidance

Zhixiong Nan, Yilong Chen, Tianfei Zhou, Tao Xiang

arXiv:2411.17152v12.21 citations · h-index: 4 · NIPS

Originality: Incremental advance

AI Analysis

This work addresses a critical problem for autonomous driving systems by enhancing object importance estimation, though it is incremental as it builds on existing methods by adding new guidance factors.

The paper tackles on-road object importance estimation from driver-view videos by introducing a new large-scale dataset (TOI) and a model that integrates multi-fold top-down guidance factors (driver intention, semantic context, traffic rules) with bottom-up features, achieving a 23.1% AP improvement over state-of-the-art methods.

This paper addresses on-road object importance estimation, which takes video sequences captured from the driver's perspective as input. Although this problem is significant for safer and smarter driving systems, it remains underexplored. On one hand, publicly available large-scale datasets are scarce. To address this gap, this paper contributes a new large-scale dataset named Traffic Object Importance (TOI). On the other hand, existing methods often consider only bottom-up features or single-fold guidance, which limits their ability to handle highly dynamic and diverse traffic scenarios. In contrast, this paper proposes a model that integrates multi-fold top-down guidance with bottom-up features. Specifically, three kinds of top-down guidance factors (i.e., driver intention, semantic context, and traffic rules) are integrated into the model. These factors are important for object importance estimation, but no existing method considers all of them simultaneously. To our knowledge, this is the first on-road object importance estimation model that fuses multi-fold top-down guidance factors with bottom-up features. Extensive experiments demonstrate that the model outperforms state-of-the-art methods by large margins, achieving a 23.1% Average Precision (AP) improvement over the recently proposed model (i.e., Goal).
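For readers unfamiliar with the headline metric, the following is a minimal sketch of how Average Precision can be computed when each object carries a predicted importance score and a binary important/unimportant label. It uses the common non-interpolated definition (mean of the precision at the rank of each positive); the abstract does not state the paper's exact evaluation protocol, so this is an assumption, and the function name `average_precision` and the toy scores are hypothetical.

```python
def average_precision(scores, labels):
    """Non-interpolated AP: mean precision at the rank of each positive.

    scores: predicted importance per object (higher = more important).
    labels: 1 if the object is annotated as important, else 0.
    Assumption: this is the generic definition, not necessarily the
    evaluation protocol used in the paper.
    """
    # Rank objects from highest to lowest predicted importance.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

    hits = 0
    precisions = []
    for rank, idx in enumerate(order, start=1):
        if labels[idx] == 1:
            hits += 1
            # Precision over the top-`rank` objects.
            precisions.append(hits / rank)

    # With no positives, AP is undefined; return 0.0 by convention here.
    return sum(precisions) / len(precisions) if precisions else 0.0


# Toy example (made-up values): four objects in one frame.
scores = [0.9, 0.8, 0.3, 0.6]
labels = [1, 0, 0, 1]
ap = average_precision(scores, labels)
print(f"AP = {ap:.4f}")
```

In this toy case the two important objects land at ranks 1 and 3, so the precisions averaged are 1/1 and 2/3. Note that a reported "23.1% AP improvement" can be read as either an absolute or a relative gain; the abstract does not say which.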

