CV · Nov 24, 2020

Canonical Voting: Towards Robust Oriented Bounding Box Detection in 3D Scenes

arXiv:2011.12001v3 · 14 citations · Has Code
AI Analysis

This work provides a more robust method for oriented bounding box detection, which is crucial for applications requiring precise 3D object localization in real-world scenes.

This paper addresses the inaccuracy of offset and orientation predictions in 3D object detection by disentangling direct offset into Local Canonical Coordinates (LCC), box scales, and box orientations. It introduces a canonical voting scheme to generate box orientations and an LCC-aware back-projection checking algorithm to refine bounding boxes, achieving state-of-the-art performance on ScanNet, SceneNN, and SUN RGB-D benchmarks.
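The disentanglement described above can be illustrated with a small round-trip sketch. It is not taken from the authors' repository: the function names (`rotate_z`, `to_lcc`, `offset_to_center`) are hypothetical, and it assumes boxes rotate only about the vertical axis and that LCC is the point's position in the box frame divided by the box scale.

```python
import math

def rotate_z(v, yaw):
    """Rotate a 3D vector about the vertical (z) axis by yaw radians."""
    c, s = math.cos(yaw), math.sin(yaw)
    x, y, z = v
    return (c * x - s * y, s * x + c * y, z)

def to_lcc(point, center, scale, yaw):
    """Express a point in the box's canonical frame, normalised by box scale.

    Assumption: LCC = R(-yaw) * (point - center) / scale, element-wise.
    """
    d = tuple(p - c for p, c in zip(point, center))
    local = rotate_z(d, -yaw)
    return tuple(l / s for l, s in zip(local, scale))

def offset_to_center(lcc, scale, yaw):
    """Recombine LCC, scale and orientation into the point-to-center offset."""
    local = tuple(l * s for l, s in zip(lcc, scale))
    world = rotate_z(local, yaw)
    return tuple(-w for w in world)

# Round trip: the recombined offset should equal center - point.
point, center = (2.0, 1.0, 0.5), (1.0, 1.0, 0.0)
scale, yaw = (2.0, 1.0, 1.0), math.pi / 2
lcc = to_lcc(point, center, scale, yaw)
offset = offset_to_center(lcc, scale, yaw)
```

Under these assumptions the direct offset is fully determined by the three factors, so a network that regresses only LCC and scale leaves orientation as the single unknown.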

3D object detection has attracted much attention thanks to advances in sensors and deep learning methods for point clouds. Current state-of-the-art methods such as VoteNet regress direct offsets towards object centers, and box orientations, with an additional multi-layer perceptron (MLP) network. Neither their offset nor their orientation predictions are accurate, owing to the fundamental difficulty of rotation classification. In this work, we disentangle the direct offset into Local Canonical Coordinates (LCC), box scales and box orientations. Only LCC and box scales are regressed, while box orientations are generated by a canonical voting scheme. Finally, an LCC-aware back-projection checking algorithm iteratively cuts out bounding boxes from the generated vote maps while eliminating false positives. Our model achieves state-of-the-art performance on three standard real-world benchmarks: ScanNet, SceneNN and SUN RGB-D. Our code is available at https://github.com/qq456cvb/CanonicalVoting.
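As a rough illustration of the voting idea in the abstract, the sketch below lets every point cast a candidate-center vote for each discretised orientation and accumulates the votes on a grid. It is a simplified 2D (bird's-eye-view) toy, not the authors' implementation: the function name `vote_centers`, the parameters `num_yaws` and `cell`, and the use of a plain counter as the vote map are all assumptions, and the LCC-aware back-projection check is not reproduced.

```python
import math
from collections import Counter

def vote_centers(points, lccs, scales, num_yaws=36, cell=0.1):
    """Accumulate candidate-center votes over every discretised yaw.

    points: (x, y) positions; lccs: per-point canonical coordinates (u, v);
    scales: per-point box scales (sx, sy). Returns a Counter keyed by grid cell.
    """
    votes = Counter()
    for (px, py), (u, v), (sx, sy) in zip(points, lccs, scales):
        lx, ly = u * sx, v * sy  # offset in the box frame, before rotation
        for k in range(num_yaws):
            yaw = 2.0 * math.pi * k / num_yaws
            c, s = math.cos(yaw), math.sin(yaw)
            # Candidate center if the box had this orientation.
            cx = px - (c * lx - s * ly)
            cy = py - (s * lx + c * ly)
            votes[(round(cx / cell), round(cy / cell))] += 1
    return votes
```

Points on the same object agree on one cell only at the true orientation, so the peak of the vote map marks a likely center, while votes cast under wrong orientations spread out along circles around each point.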

Code Implementations: 1 repo