CV | May 6, 2025
OccCylindrical: Multi-Modal Fusion with Cylindrical Representation for 3D Semantic Occupancy Prediction
Zhenxing Ming, Julie Stephany Berrio, Mao Shan et al.
The safe operation of autonomous vehicles (AVs) is highly dependent on their understanding of the surroundings. For this, the task of 3D semantic occupancy prediction divides the space around the sensors into voxels and labels each voxel with both occupancy and semantic information. Recent perception models have used multisensor fusion to perform this task. However, existing multisensor fusion-based approaches focus mainly on using sensor information in the Cartesian coordinate system. This ignores the distribution of the sensor readings, leading to a loss of fine-grained details and performance degradation. In this paper, we propose OccCylindrical, which merges and refines the features of the different modalities under cylindrical coordinates. Our method preserves more fine-grained geometric detail, which leads to better performance. Extensive experiments conducted on the nuScenes dataset, including challenging rainy and nighttime scenarios, confirm our approach's effectiveness and state-of-the-art performance. The code will be available at: https://github.com/DanielMing123/OccCylindrical
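To make the coordinate change concrete, the following is a minimal sketch of mapping a Cartesian point to cylindrical coordinates and then to a voxel index. It is not taken from the OccCylindrical code; the function names, grid size, and range limits (`rho_max`, `z_min`, `z_max`, `grid`) are illustrative assumptions.

```python
import math


def cartesian_to_cylindrical(x, y, z):
    """Map a Cartesian point to (rho, phi, z), with phi in [-pi, pi]."""
    rho = math.hypot(x, y)   # radial distance from the vertical axis
    phi = math.atan2(y, x)   # azimuth angle
    return rho, phi, z


def cylindrical_voxel_index(point, rho_max=50.0, z_min=-5.0, z_max=3.0,
                            grid=(128, 360, 16)):
    """Return the (radial, azimuth, height) voxel index of a point.

    All ranges and the grid shape are hypothetical defaults chosen for
    illustration. Returns None when the point lies outside the volume.
    """
    rho, phi, z = cartesian_to_cylindrical(*point)
    if rho >= rho_max or not (z_min <= z < z_max):
        return None
    n_rho, n_phi, n_z = grid
    i = int(rho / rho_max * n_rho)
    # The modulo wraps phi == pi back into the first azimuth bin.
    j = int((phi + math.pi) / (2 * math.pi) * n_phi) % n_phi
    k = int((z - z_min) / (z_max - z_min) * n_z)
    return i, j, k
```

With uniform azimuth bins, cells close to the sensor cover less area than distant ones, which roughly follows how the density of a rotating LiDAR's returns falls off with range.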
CV | May 1, 2025
InterLoc: LiDAR-based Intersection Localization using Road Segmentation with Automated Evaluation Method
Nguyen Hoang Khoi Tran, Julie Stephany Berrio, Mao Shan et al.
Online localization of road intersections is beneficial for autonomous vehicle localization, mapping and motion planning. Intersections offer strong landmarks for correcting vehicle pose estimation, anchoring new sensor data in up-to-date maps, and guiding vehicle routing in road network graphs. Despite this importance, intersection localization has not been widely studied, with existing methods either ignoring the rich semantic information already computed onboard or relying on scarce, hand-labeled intersection datasets. To close this gap, we present a novel LiDAR-based method for online vehicle-centric intersection localization. We detect the intersection candidates in a bird's eye view (BEV) representation formed by concatenating a sequence of semantic road scans. We then refine these candidates by analyzing the intersecting road branches and adjusting the intersection center point in a least-squares formulation. For evaluation, we introduce an automated pipeline that pairs localized intersection points with OpenStreetMap (OSM) intersection nodes using precise GNSS/INS ground-truth poses. Experiments on the SemanticKITTI dataset show that our method outperforms the latest learning-based baseline in accuracy and reliability. Sensitivity tests demonstrate the method's robustness to challenging segmentation errors, highlighting its applicability in the real world.
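The abstract does not spell out the least-squares formulation, so the following is only one plausible sketch of the refinement idea: given road branches modeled as 2D lines, find the point that minimizes the sum of squared perpendicular distances to all of them. The function name `refine_center` and the line-based branch model are assumptions, not the authors' method.

```python
import math


def refine_center(branches):
    """Least-squares point closest to a set of 2D lines.

    branches: list of ((px, py), (dx, dy)), where (px, py) is a point on a
    branch centerline and (dx, dy) is its direction (any nonzero length).
    Hypothetical branch model used for illustration only.
    """
    a11 = a12 = a22 = b1 = b2 = 0.0
    for (px, py), (dx, dy) in branches:
        norm = math.hypot(dx, dy)
        dx, dy = dx / norm, dy / norm
        # Projector onto the line's normal direction: I - d d^T.
        m11, m12, m22 = 1.0 - dx * dx, -dx * dy, 1.0 - dy * dy
        a11 += m11
        a12 += m12
        a22 += m22
        b1 += m11 * px + m12 * py
        b2 += m12 * px + m22 * py
    # Solve the 2x2 normal equations A c = b by Cramer's rule.
    det = a11 * a22 - a12 * a12
    if abs(det) < 1e-9:
        raise ValueError("branches are (nearly) parallel; center is undefined")
    return (a22 * b1 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det
```

For two perpendicular branches, one through (5, 2) heading along x and one through (3, -4) heading along y, the result is their crossing point at (3, 2). With three or more noisy branches, the same solve returns a compromise point.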
RO | Sep 17, 2025
InterKey: Cross-modal Intersection Keypoints for Global Localization on OpenStreetMap
Nguyen Hoang Khoi Tran, Julie Stephany Berrio, Mao Shan et al.
Reliable global localization is critical for autonomous vehicles, especially in environments where GNSS is degraded or unavailable, such as urban canyons and tunnels. Although high-definition (HD) maps provide accurate priors, the cost of data collection, map construction, and maintenance limits scalability. OpenStreetMap (OSM) offers a free and globally available alternative, but its coarse abstraction poses challenges for matching with sensor data. We propose InterKey, a cross-modal framework that leverages road intersections as distinctive landmarks for global localization. Our method constructs compact binary descriptors by jointly encoding road and building imprints from point clouds and OSM. To bridge modality gaps, we introduce discrepancy mitigation, orientation determination, and area-equalized sampling strategies, enabling robust cross-modal matching. Experiments on the KITTI dataset demonstrate that InterKey achieves state-of-the-art accuracy, outperforming recent baselines by a large margin. The framework generalizes to sensors that can produce dense structural point clouds, offering a scalable and cost-effective solution for robust vehicle localization.
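As a rough illustration of why compact binary descriptors are attractive for matching, the sketch below compares descriptors by Hamming distance and returns the closest database entry. The abstract does not state which distance or search strategy InterKey uses, so the Hamming metric, the function names, and the toy 8-bit descriptors are all assumptions.

```python
def hamming(a: int, b: int) -> int:
    """Number of differing bits between two descriptors stored as ints."""
    return bin(a ^ b).count("1")


def best_match(query: int, database: dict) -> tuple:
    """Return (name, distance) of the database entry closest to the query.

    database maps a hypothetical intersection name to its binary descriptor.
    """
    name, descriptor = min(database.items(),
                           key=lambda item: hamming(query, item[1]))
    return name, hamming(query, descriptor)


# Toy example: a descriptor built from sensor data is compared with
# descriptors built from map data for two candidate intersections.
osm_descriptors = {"A": 0b11110000, "B": 0b10101010}
match = best_match(0b11110001, osm_descriptors)
```

Here `match` is `("A", 1)`, since the query differs from descriptor A in a single bit and from descriptor B in five.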