CV | Mar 28, 2024

GeoAuxNet: Towards Universal 3D Representation Learning for Multi-sensor Point Clouds

arXiv:2403.19220v1 | 9 citations | h-index: 13 | CVPR
Originality: Incremental advance
AI Analysis

This paper addresses universal 3D representation learning for multi-sensor point clouds, but the contribution appears incremental because it builds on existing voxel-based methods.

The paper tackles the domain gaps between point clouds captured by different sensors, such as RGB-D cameras and LiDAR. It proposes GeoAuxNet, which uses geometry-to-voxel auxiliary learning to give voxel-based backbones access to point-level geometric information, and it achieves results competitive with state-of-the-art expert models on each single dataset.

Point clouds captured by different sensors such as RGB-D cameras and LiDAR possess non-negligible domain gaps. Most existing methods design different network architectures and train separately on point clouds from various sensors. Typically, point-based methods achieve outstanding performance on evenly distributed dense point clouds from RGB-D cameras, while voxel-based methods are more efficient for large-range sparse LiDAR point clouds. In this paper, we propose geometry-to-voxel auxiliary learning to enable voxel representations to access point-level geometric information, which supports better generalisation of the voxel-based backbone with additional interpretations of multi-sensor point clouds. Specifically, we construct hierarchical geometry pools generated by a voxel-guided dynamic point network, which efficiently provide auxiliary fine-grained geometric information adapted to different stages of voxel features. We conduct experiments on joint multi-sensor datasets to demonstrate the effectiveness of GeoAuxNet. Benefiting from elaborate geometric information, our method outperforms other models collectively trained on multi-sensor datasets, and achieves competitive results with state-of-the-art experts on each single dataset.
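To make the voxel-versus-point distinction in the abstract concrete, here is a minimal sketch of plain voxelization: points are grouped into grid cells, and each cell keeps its centroid along with the per-point offsets that a purely voxel-level representation would discard. This is not the GeoAuxNet method or its geometry pools. The function names `voxelize` and `voxel_geometry`, the `voxel_size` parameter, and the sample points are all assumptions made for illustration.

```python
from collections import defaultdict
from math import floor


def voxelize(points, voxel_size):
    """Group 3D points by the integer grid cell (voxel) that contains them."""
    voxels = defaultdict(list)
    for x, y, z in points:
        key = (floor(x / voxel_size), floor(y / voxel_size), floor(z / voxel_size))
        voxels[key].append((x, y, z))
    return dict(voxels)


def voxel_geometry(voxels):
    """For each voxel, compute the centroid and each point's offset from it.

    The offsets stand in for the fine-grained, point-level detail that a
    single per-voxel feature does not retain (an illustrative assumption).
    """
    summary = {}
    for key, pts in voxels.items():
        n = len(pts)
        centroid = tuple(sum(p[i] for p in pts) / n for i in range(3))
        offsets = [tuple(p[i] - centroid[i] for i in range(3)) for p in pts]
        summary[key] = {"centroid": centroid, "offsets": offsets}
    return summary


# Hypothetical sample: two points share one voxel, the third falls in a neighbour.
points = [(0.1, 0.2, 0.3), (0.3, 0.2, 0.1), (1.2, 0.1, 0.1)]
voxels = voxelize(points, voxel_size=1.0)
geometry = voxel_geometry(voxels)
```

A larger `voxel_size` merges more points into each cell, which is cheaper for sparse, large-range scans but loses more local detail; the offsets show what is lost.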

Code Implementations: 1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
