CV, RO | Jun 12, 2023

MaskedFusion360: Reconstruct LiDAR Data by Querying Camera Features

arXiv:2306.07087v1 | 1 citation | h-index: 5 | Has Code
Originality: Incremental advance
AI Analysis

This work addresses the challenge of integrating semantically rich camera data with accurate LiDAR data for improved perception in self-driving systems, representing an incremental advance over existing fusion methods.

The paper tackles the problem of fusing LiDAR and camera data for self-driving applications by introducing a self-supervised method that reconstructs masked LiDAR data using fused features, reducing the need for complex spatial transformations.
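The masked-reconstruction objective mentioned above can be illustrated with a minimal sketch. It assumes a dense 2D LiDAR range image stored as nested lists, and the function names, patch size, and mask ratio are hypothetical choices for illustration; none of them come from the paper or its repository. The sketch only shows how patches are hidden and how a reconstruction error is scored on the hidden patches, not the fusion model itself.

```python
import random


def mask_patches(range_image, patch=2, mask_ratio=0.5, seed=0):
    """Zero out a random subset of square patches (hypothetical helper).

    Assumes the image height and width are divisible by `patch`.
    Returns the masked copy and the set of masked patch indices.
    """
    h, w = len(range_image), len(range_image[0])
    ids = [(r, c) for r in range(h // patch) for c in range(w // patch)]
    rng = random.Random(seed)
    masked = set(rng.sample(ids, int(len(ids) * mask_ratio)))
    out = [row[:] for row in range_image]
    for pr, pc in masked:
        for i in range(pr * patch, (pr + 1) * patch):
            for j in range(pc * patch, (pc + 1) * patch):
                out[i][j] = 0.0
    return out, masked


def masked_mse(pred, target, masked, patch=2):
    """Mean squared error computed only over pixels of masked patches."""
    total, count = 0.0, 0
    for pr, pc in masked:
        for i in range(pr * patch, (pr + 1) * patch):
            for j in range(pc * patch, (pc + 1) * patch):
                total += (pred[i][j] - target[i][j]) ** 2
                count += 1
    return total / count if count else 0.0
```

In a real training loop, `pred` would be the output of the model that sees the visible LiDAR patches together with camera features; here any array of the same shape can stand in for it.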

In self-driving applications, LiDAR data provides accurate information about distances in 3D but lacks the semantic richness of camera data. Therefore, state-of-the-art methods for perception in urban scenes fuse data from both sensor types. In this work, we introduce a novel self-supervised method to fuse LiDAR and camera data for self-driving applications. We build upon masked autoencoders (MAEs) and train deep learning models to reconstruct masked LiDAR data from fused LiDAR and camera features. In contrast to related methods that use bird's-eye-view representations, we fuse features from dense spherical LiDAR projections and features from fish-eye camera crops with a similar field of view. As a result, we reduce the learned spatial transformations to moderate perspective transformations and do not require additional modules to generate dense LiDAR representations. Code is available at: https://github.com/KIT-MRT/masked-fusion-360
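To make the "dense spherical LiDAR projection" concrete, here is a minimal sketch of the conventional range-image projection, in which each 3D point is mapped to a pixel by its azimuth and elevation. This is a generic formulation, not code from the linked repository; the function name, image size, and vertical field of view are assumptions chosen for illustration.

```python
import math


def spherical_projection(points, height=4, width=8,
                         fov_up_deg=15.0, fov_down_deg=-15.0):
    """Project (x, y, z) points onto a height x width range image.

    Hypothetical parameters: the image size and vertical field of view
    depend on the sensor. Empty pixels stay 0.0; when several points
    fall into one pixel, the closest return is kept.
    """
    fov_up = math.radians(fov_up_deg)
    fov_down = math.radians(fov_down_deg)
    fov = fov_up - fov_down
    image = [[0.0] * width for _ in range(height)]
    for x, y, z in points:
        r = math.sqrt(x * x + y * y + z * z)
        if r == 0.0:
            continue  # skip degenerate points at the sensor origin
        azimuth = math.atan2(y, x)      # horizontal angle in [-pi, pi]
        elevation = math.asin(z / r)    # vertical angle
        if not (fov_down <= elevation <= fov_up):
            continue  # outside the assumed vertical field of view
        col = int(0.5 * (1.0 - azimuth / math.pi) * width)
        row = int((fov_up - elevation) / fov * height)
        col = min(max(col, 0), width - 1)
        row = min(max(row, 0), height - 1)
        if image[row][col] == 0.0 or r < image[row][col]:
            image[row][col] = r
    return image
```

Because rows follow elevation and columns follow azimuth, such an image shares roughly the same viewing geometry as a camera crop facing the same direction, which is the property the abstract relies on when it speaks of moderate perspective transformations.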

Code Implementations: 1 repo
