Bo Shang

25.2 ET Apr 12

Roadside LiDAR for Cooperative Safety Auditing at Urban Intersections: Toward Auditable V2X Infrastructure Intelligence

Bo Shang, Yiqiao Li

Urban intersections expose the limitations of single-vehicle perception under occlusion and partial observability. In this study, we present an auditable roadside LiDAR framework for infrastructure-assisted safety analysis at a signalized urban intersection in New York City, developed and evaluated using real-world data. The proposed framework integrates trajectory construction, iterative human-in-the-loop quality assurance (QA), and interpretable near-miss analytics to produce defensible safety evidence from infrastructure sensing. Using a human-labeled heavy vehicle-bicycle interaction as an anchor case, we show that direction-agnostic time-to-collision (TTC) drops below 1 s while longitudinal TTC remains above conservative braking thresholds, revealing a lateral-intrusion-dominated conflict mechanism. Beyond individual cases, continuous-window evaluation and multi-round QA analysis demonstrate that the framework systematically reduces failure modes such as track fragmentation, spurious TTC triggers, unstable geometry, and cross-lane false conflicts. These results position roadside LiDAR as a practical post-hoc auditing mechanism for cooperative perception systems, and we discuss broader statistical validation. This work provides a pathway toward scalable, data-driven safety auditing of urban intersections, enabling transportation agencies to identify and mitigate high-risk interactions beyond crash-based analyses.
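
To illustrate why the two TTC measures in the abstract can disagree, here is a minimal sketch under stated assumptions. The abstract does not define either measure, so both definitions below are illustrative choices rather than the authors' method: the direction-agnostic TTC treats the pair as two discs with a combined radius moving at constant velocity, and the longitudinal TTC divides the gap along one road user's heading by the closing speed along that heading. The function names, the disc model, and the numeric values are all hypothetical.

```python
import math


def ttc_direction_agnostic(p_a, v_a, p_b, v_b, radius):
    """Time until two constant-velocity discs first touch (assumed model).

    `radius` is the sum of both disc radii. Returns math.inf if they never touch.
    """
    rx, ry = p_b[0] - p_a[0], p_b[1] - p_a[1]   # relative position
    vx, vy = v_b[0] - v_a[0], v_b[1] - v_a[1]   # relative velocity
    c = rx * rx + ry * ry - radius * radius
    if c <= 0:
        return 0.0                               # already overlapping
    a = vx * vx + vy * vy
    b = 2.0 * (rx * vx + ry * vy)
    if a == 0 or b >= 0:
        return math.inf                          # no relative motion, or separating
    disc = b * b - 4.0 * a * c
    if disc < 0:
        return math.inf                          # paths pass without touching
    return (-b - math.sqrt(disc)) / (2.0 * a)    # earliest contact time


def ttc_longitudinal(p_a, v_a, p_b, v_b, heading_a):
    """Gap along A's heading divided by closing speed along that heading."""
    hx, hy = math.cos(heading_a), math.sin(heading_a)
    gap = (p_b[0] - p_a[0]) * hx + (p_b[1] - p_a[1]) * hy
    closing = (v_a[0] - v_b[0]) * hx + (v_a[1] - v_b[1]) * hy
    if gap <= 0 or closing <= 0:
        return math.inf                          # B is not ahead, or gap is not shrinking
    return gap / closing


# Hypothetical lateral-intrusion geometry: B is slightly ahead of A and
# drifting sideways into A's path while both travel at similar forward speed.
truck_pos, truck_vel = (0.0, 0.0), (10.0, 0.0)
bike_pos, bike_vel = (2.0, 3.0), (9.8, -3.0)

ttc_any = ttc_direction_agnostic(truck_pos, truck_vel, bike_pos, bike_vel, radius=2.0)
ttc_long = ttc_longitudinal(truck_pos, truck_vel, bike_pos, bike_vel, heading_a=0.0)
```

In this made-up configuration the forward gap closes slowly, so the longitudinal measure stays large, while the sideways motion brings the discs into contact in under a second. That is the kind of divergence the abstract attributes to a lateral-intrusion-dominated conflict.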

CV Feb 10

Bridging the Modality Gap in Roadside LiDAR: A Training-Free Vision-Language Model Framework for Vehicle Classification

Yiqiao Li, Bo Shang, Jie Wei

Fine-grained truck classification is critical for intelligent transportation systems (ITS), yet current LiDAR-based methods face scalability challenges due to their reliance on supervised deep learning and labor-intensive manual annotation. Vision-Language Models (VLMs) offer promising few-shot generalization, but their application to roadside LiDAR is limited by a modality gap between sparse 3D point clouds and dense 2D imagery. We propose a framework that bridges this gap by adapting off-the-shelf VLMs for fine-grained truck classification without parameter fine-tuning. Our new depth-aware image generation pipeline applies noise removal, spatial and temporal registration, orientation rectification, morphological operations, and anisotropic smoothing to transform sparse, occluded LiDAR scans into depth-encoded 2D visual proxies. Validated on a real-world dataset of 20 vehicle classes, our approach achieves competitive classification accuracy with as few as 16-30 examples per class, offering a scalable alternative to data-intensive supervised baselines. We further observe a "Semantic Anchor" effect: text-based guidance regularizes performance in ultra-low-shot regimes ($k < 4$) but degrades accuracy in higher-shot settings due to semantic mismatch. Furthermore, we demonstrate the efficacy of this framework as a Cold Start strategy, using VLM-generated labels to bootstrap lightweight supervised models. Notably, the few-shot VLM-based model achieves a correct classification rate of over 75 percent for specific drayage categories (20ft, 40ft, and 53ft containers) entirely without costly training or fine-tuning, significantly reducing the demands of initial manual labeling and making the method practical for ITS applications.
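
The idea of a depth-encoded 2D visual proxy can be sketched as follows. This is only an illustration of the final projection step, assuming an already cleaned, registered, and orientation-rectified point cloud; the noise removal, registration, morphological, and smoothing stages named in the abstract are not shown. The function name `depth_image`, the axis convention, the pixel size, and the brightness mapping are assumptions, not the authors' pipeline.

```python
import numpy as np


def depth_image(points, pixel=0.1, shape=(32, 128)):
    """Project an (N, 3) point cloud to a side-view image (assumed convention).

    Columns follow x (vehicle length), rows follow z (height, top of image is
    highest), and brightness encodes y (distance from the sensor): nearest
    points map to 1.0, farthest to 0.2, and empty pixels stay at 0.0.
    """
    h, w = shape
    x, y, z = points[:, 0], points[:, 1], points[:, 2]

    col = np.floor((x - x.min()) / pixel).astype(int)
    row = h - 1 - np.floor((z - z.min()) / pixel).astype(int)
    keep = (col >= 0) & (col < w) & (row >= 0) & (row < h)

    span = max(float(y.max() - y.min()), 1e-9)   # avoid division by zero
    value = 0.2 + 0.8 * (1.0 - (y - y.min()) / span)

    img = np.zeros(shape, dtype=float)
    # When several points land in one pixel, keep the brightest (nearest) one.
    np.maximum.at(img, (row[keep], col[keep]), value[keep])
    return img


# Tiny hypothetical cloud: two points share a pixel, one sits higher and farther along.
demo_points = np.array([[0.0, 0.0, 0.0],
                        [0.0, 1.0, 0.0],
                        [1.0, 0.5, 1.0]])
demo_img = depth_image(demo_points, pixel=1.0, shape=(2, 2))
```

The resulting array could be scaled to 8-bit and saved as an ordinary image before being passed to a VLM. Reserving 0.0 for empty pixels keeps missing returns distinguishable from distant surfaces.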
