CV · May 31, 2022

Voxel Field Fusion for 3D Object Detection

arXiv:2205.15938v1 · 111 citations · h-index: 106 · Has Code
Originality: Incremental advance
AI Analysis

This paper addresses cross-modality consistency in 3D object detection for autonomous driving, offering incremental improvements over existing fusion methods.

The paper tackles cross-modality 3D object detection by introducing voxel field fusion, which maintains consistency by representing image features as rays in a voxel field and fusing them ray by ray. It reports consistent gains across benchmarks and outperforms previous fusion-based methods on the KITTI and nuScenes datasets.

In this work, we present a conceptually simple yet effective framework for cross-modality 3D object detection, named voxel field fusion. The proposed approach aims to maintain cross-modality consistency by representing and fusing augmented image features as a ray in the voxel field. To this end, the learnable sampler is first designed to sample vital features from the image plane that are projected to the voxel grid in a point-to-ray manner, which maintains the consistency in feature representation with spatial context. In addition, ray-wise fusion is conducted to fuse features with the supplemental context in the constructed voxel field. We further develop mixed augmentor to align feature-variant transformations, which bridges the modality gap in data augmentation. The proposed framework is demonstrated to achieve consistent gains in various benchmarks and outperforms previous fusion-based methods on KITTI and nuScenes datasets. Code is made available at https://github.com/dvlab-research/VFF.
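The point-to-ray idea in the abstract can be illustrated with a small geometric sketch: rather than mapping a pixel to a single 3D point, the pixel is associated with every voxel its viewing ray passes through. The code below is not the authors' implementation. It assumes a pinhole camera whose frame coincides with the voxel grid frame (no extrinsics), uniform depth sampling, and a hypothetical helper name `pixel_to_ray_voxels`. The learnable sampler, ray-wise fusion, and mixed augmentor described in the paper are not reproduced here.

```python
import math


def pixel_to_ray_voxels(u, v, fx, fy, cx, cy, grid_min, voxel_size, grid_shape, depths):
    """Return the ordered, de-duplicated voxel indices crossed by the ray through pixel (u, v).

    Assumptions (illustrative only): pinhole intrinsics (fx, fy, cx, cy), camera frame
    equal to the voxel grid frame, and a fixed list of sampled depths along the ray.
    """
    voxels = []
    seen = set()
    for d in depths:
        # Back-project the pixel to a 3D point at depth d.
        point = ((u - cx) * d / fx, (v - cy) * d / fy, d)
        # Convert the metric point to integer voxel coordinates.
        idx = tuple(math.floor((p - lo) / voxel_size) for p, lo in zip(point, grid_min))
        # Keep only voxels inside the grid, each at most once.
        inside = all(0 <= i < n for i, n in zip(idx, grid_shape))
        if inside and idx not in seen:
            seen.add(idx)
            voxels.append(idx)
    return voxels


# A pixel at the principal point looks straight down the z axis.
ray = pixel_to_ray_voxels(
    u=50, v=50, fx=100, fy=100, cx=50, cy=50,
    grid_min=(-2.0, -2.0, 0.0), voxel_size=1.0, grid_shape=(4, 4, 4),
    depths=[0.5, 1.5, 2.5, 3.5, 4.5],
)
print(ray)  # [(2, 2, 0), (2, 2, 1), (2, 2, 2), (2, 2, 3)]
```

In a fusion pipeline, the image feature at that pixel would then be attached to each returned voxel, which is one plausible reading of how a ray representation can keep image and point-cloud features spatially aligned.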

Code Implementations: 1 repo
