CV · LG · Sep 16, 2021

Lifting 2D Object Locations to 3D by Discounting LiDAR Outliers across Objects and Views

arXiv:2109.07945v2 · 10 citations
AI Analysis

This paper addresses 3D object localization from sparse sensor data, a problem relevant to autonomous driving and robotics, though it appears incremental, offering specific enhancements to existing approaches.

The paper tackles the problem of converting 2D object masks and partial LiDAR point clouds into accurate 3D bounding boxes by sharing information across all objects and frames, resulting in significant performance improvements over previous methods that used more complex pipelines and external data.

We present a system for automatically converting 2D mask object predictions and raw LiDAR point clouds into full 3D bounding boxes of objects. Because the LiDAR point clouds are partial, directly fitting bounding boxes to the point clouds is meaningless. Instead, we suggest that obtaining good results requires sharing information between all objects in the dataset jointly, over multiple frames. We then make three improvements to the baseline. First, we address ambiguities in predicting the object rotations via direct optimization in this space while still backpropagating rotation prediction through the model. Second, we explicitly model outliers and task the network with learning their typical patterns, thus better discounting them. Third, we enforce temporal consistency when video data is available. With these contributions, our method significantly outperforms previous work, even though those methods use significantly more complex pipelines, 3D models, and additional human-annotated external sources of prior information.
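To illustrate why the abstract calls direct box fitting on partial point clouds meaningless, here is a toy bird's-eye-view sketch. It is not the paper's method: the function names, the median-based estimate, and the fixed size prior (set here to the true size for simplicity) are assumptions made for illustration. The paper instead learns shared information and outlier patterns with a network.

```python
import numpy as np

def naive_box(points):
    """Fit an axis-aligned box from the min/max extent of the points."""
    lo, hi = points.min(axis=0), points.max(axis=0)
    return (lo + hi) / 2.0, hi - lo

def prior_box(points, prior_size, sensor=(0.0, 0.0)):
    """Assumed alternative (not from the paper): take the median surface
    point and push it half a prior length away from the sensor."""
    surface = np.median(points, axis=0)  # median ignores a minority of outliers
    ray = surface - np.asarray(sensor)
    ray = ray / np.linalg.norm(ray)
    center = surface + 0.5 * prior_size[0] * ray
    return center, np.asarray(prior_size, dtype=float)

# Hypothetical car directly ahead of a sensor at the origin: (x, y) in metres.
true_center = np.array([10.0, 0.0])
true_size = np.array([4.0, 2.0])  # length along x, width along y

# Only the rear face (x = 8) returns LiDAR points: a partial point cloud.
ys = np.linspace(-1.0, 1.0, 21)
rear_face = np.stack([np.full_like(ys, 8.0), ys], axis=1)

# A few background points that fall inside the 2D mask act as outliers.
outliers = np.array([[30.0, 0.2], [31.0, -0.3], [29.0, 0.0]])
points = np.concatenate([rear_face, outliers], axis=0)

naive_center, naive_size = naive_box(points)
prior_center, _ = prior_box(points, prior_size=true_size)

naive_error = np.linalg.norm(naive_center - true_center)
prior_error = np.linalg.norm(prior_center - true_center)
print(f"naive center error: {naive_error:.2f} m")
print(f"prior center error: {prior_error:.2f} m")
```

The min/max fit is stretched by the background points and misplaces the box, while an estimate that discounts outliers and borrows a size prior stays close to the true center. In this sketch the prior is hand-set; the abstract describes obtaining such shared knowledge jointly from all objects and frames.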

Code Implementations · 1 repo
