CV, ML | May 30, 2018

Multi-level 3D CNN for Learning Multi-scale Spatial Features

arXiv:1805.12254v2 | 9 citations
Originality: Incremental advance
AI Analysis

This paper addresses memory efficiency in 3D object recognition for computer vision applications, but the advance is incremental because it builds on existing voxel-based methods.

The paper tackles 3D object recognition by proposing a multi-level voxel grid approach to learn multi-scale spatial features, achieving performance comparable to dense voxel representations with significantly lower memory usage.

3D object recognition accuracy can be improved by learning multi-scale spatial features from 3D geometric representations of objects such as point clouds, 3D models, surfaces, and RGB-D data. Current deep learning approaches learn such features either from structured data representations (voxel grids and octrees) or from unstructured representations (graphs and point clouds). Learning features from structured representations is limited by restrictions on resolution and tree depth, while unstructured representations pose a challenge because of non-uniformity among data samples. In this paper, we propose an end-to-end multi-level learning approach on a multi-level voxel grid to overcome these drawbacks. To demonstrate the utility of the proposed multi-level learning, we use a multi-level voxel representation of 3D objects to perform object recognition. The multi-level voxel representation consists of a coarse voxel grid that contains volumetric information of the 3D object. In addition, each voxel in the coarse grid that contains a portion of the object boundary is subdivided into multiple fine-level voxel grids. The performance of our multi-level learning algorithm for object recognition is comparable to that of dense voxel representations while using significantly less memory.
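The multi-level voxel representation described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names, the toy sphere input, the 0/1/2 labelling of coarse voxels, and the rule that a coarse voxel counts as "boundary" when its fine block is partly filled are all assumptions made for illustration. It compares only the number of stored voxels, not actual memory in bytes.

```python
from itertools import product


def sphere_occupancy(n, radius):
    """Toy input (assumption): dense n x n x n binary grid of a solid sphere."""
    c = (n - 1) / 2.0
    return [[[1 if (x - c) ** 2 + (y - c) ** 2 + (z - c) ** 2 <= radius ** 2 else 0
              for z in range(n)] for y in range(n)] for x in range(n)]


def build_multilevel(dense, coarse_n, k):
    """Split a dense (coarse_n * k)^3 grid into a coarse grid plus fine blocks.

    Returns (coarse, fine). coarse[i][j][l] is 0 (empty), 1 (interior) or
    2 (boundary); fine maps each boundary index to its k x k x k block.
    """
    coarse = [[[0] * coarse_n for _ in range(coarse_n)] for _ in range(coarse_n)]
    fine = {}
    for i, j, l in product(range(coarse_n), repeat=3):
        block = [[[dense[i * k + a][j * k + b][l * k + c] for c in range(k)]
                  for b in range(k)] for a in range(k)]
        filled = sum(v for plane in block for row in plane for v in row)
        if filled == 0:
            coarse[i][j][l] = 0          # entirely outside the object
        elif filled == k ** 3:
            coarse[i][j][l] = 1          # entirely inside the object
        else:
            coarse[i][j][l] = 2          # partly filled: treated as boundary
            fine[(i, j, l)] = block      # only boundary voxels keep fine detail
    return coarse, fine


def voxel_counts(coarse_n, k, fine):
    """Stored voxel counts for the dense grid and the two-level structure."""
    dense_count = (coarse_n * k) ** 3
    multilevel_count = coarse_n ** 3 + len(fine) * k ** 3
    return dense_count, multilevel_count


COARSE_N, K = 8, 4  # hypothetical sizes: 8^3 coarse grid, 4^3 fine blocks
dense = sphere_occupancy(COARSE_N * K, radius=12.0)
coarse, fine = build_multilevel(dense, COARSE_N, K)
dense_count, multilevel_count = voxel_counts(COARSE_N, K, fine)
print(f"boundary coarse voxels: {len(fine)} of {COARSE_N ** 3}")
print(f"stored voxels: dense={dense_count}, multi-level={multilevel_count}")
```

Because fine blocks are kept only where the surface passes through a coarse voxel, the stored voxel count grows roughly with the object's surface area rather than its volume, which is the intuition behind the memory saving claimed in the abstract.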

Code Implementations: 1 repo