CV | Jan 15

SVII-3D: Advancing Roadside Infrastructure Inventory with Decimeter-level 3D Localization and Comprehension from Sparse Street Imagery

Chong Liu, Luxuan Fu, Yang Jia, Zhen Dong, Bisheng Yang

arXiv:2601.10535v1 | 1.5 | h-index: 11

Originality: Incremental advance

AI Analysis

SVII-3D provides a scalable, cost-effective solution for smart city construction and facility management, though it appears to be an incremental advance over existing methods for roadside infrastructure inventory.

The paper tackles the challenge of creating digital twins and asset inventories from sparse street imagery by proposing SVII-3D, a unified framework that improves identification accuracy and achieves decimeter-level 3D localization.

The automated creation of digital twins and precise asset inventories is a critical task in smart city construction and facility lifecycle management. However, utilizing cost-effective sparse imagery remains challenging due to limited robustness, inaccurate localization, and a lack of fine-grained state understanding. To address these limitations, SVII-3D, a unified framework for holistic asset digitization, is proposed. First, LoRA fine-tuned open-set detection is fused with a spatial-attention matching network to robustly associate observations across sparse views. Second, a geometry-guided refinement mechanism is introduced to resolve structural errors, achieving precise decimeter-level 3D localization. Third, transcending static geometric mapping, a Vision-Language Model agent leveraging multi-modal prompting is incorporated to automatically diagnose fine-grained operational states. Experiments demonstrate that SVII-3D significantly improves identification accuracy and minimizes localization errors. Consequently, this framework offers a scalable, cost-effective solution for high-fidelity infrastructure digitization, effectively bridging the gap between sparse perception and automated intelligent maintenance.
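As a rough illustration of what 3D localization from sparse views involves, the sketch below finds the point closest, in the least-squares sense, to several viewing rays that have already been associated with the same asset. It is not the SVII-3D pipeline: the abstract does not specify the geometry-guided refinement, so none of it is reproduced here. The function names (`triangulate_rays`, `solve3`, `det3`) and the camera positions in the usage note are hypothetical.

```python
import math


def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def solve3(A, b):
    """Solve the 3x3 system A p = b with Cramer's rule."""
    d = det3(A)
    if abs(d) < 1e-12:
        raise ValueError("rays are (near-)parallel; the point is not determined")
    solution = []
    for k in range(3):
        # Replace column k of A with b, then take the determinant ratio.
        Ak = [[b[i] if j == k else A[i][j] for j in range(3)] for i in range(3)]
        solution.append(det3(Ak) / d)
    return solution


def triangulate_rays(origins, directions):
    """Return the 3D point minimizing the summed squared distance to all rays.

    Each ray is treated as an infinite line through `origin` along `direction`.
    The normal equations are  sum(I - u u^T) p = sum((I - u u^T) o),
    where u is the unit direction of each ray and o is its origin.
    """
    A = [[0.0] * 3 for _ in range(3)]
    b = [0.0] * 3
    for o, d in zip(origins, directions):
        norm = math.sqrt(sum(c * c for c in d))
        u = [c / norm for c in d]
        for i in range(3):
            for j in range(3):
                # Entry (i, j) of the projector orthogonal to the ray.
                m = (1.0 if i == j else 0.0) - u[i] * u[j]
                A[i][j] += m
                b[i] += m * o[j]
    return solve3(A, b)
```

For example, three hypothetical camera positions along a street, each with a noise-free ray toward the same asset, recover that asset's position; with noisy rays the same call returns the least-squares compromise instead.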

