CV · Nov 5, 2018

SPNet: Deep 3D Object Classification and Retrieval using Stereographic Projection

arXiv:1811.01571v2 · 79 citations
Originality: Incremental advance
AI Analysis

This paper addresses efficient 3D object analysis for computer vision applications. Its contribution is incremental, since it builds on existing projection and CNN techniques.

The paper tackles 3D object classification and retrieval by proposing SPNet, which uses stereographic projection to transform 3D volumes into 2D images and then applies a shallow CNN with view ensemble. It achieves performance comparable to state-of-the-art methods while using substantially less GPU memory and fewer network parameters.

We propose an efficient Stereographic Projection Neural Network (SPNet) for learning representations of 3D objects. We first transform a 3D input volume into a 2D planar image using stereographic projection. We then present a shallow 2D convolutional neural network (CNN) to estimate the object category, followed by view ensemble, which combines the responses from multiple views of the object to further enhance the predictions. Specifically, the proposed approach consists of four stages: (1) stereographic projection of a 3D object, (2) view-specific feature learning, (3) view selection, and (4) view ensemble. The proposed approach performs comparably to the state-of-the-art methods while requiring substantially less GPU memory and fewer network parameters. Despite its light weight, experiments on 3D object classification and shape retrieval demonstrate the high performance of the proposed method.
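To make stages (1) and (4) concrete, here is a minimal sketch in plain Python. It is not the authors' implementation. It assumes the textbook stereographic projection from the north pole of the unit sphere onto the plane z = 0, and it assumes the view ensemble is a simple average of per-view class probabilities; the abstract does not specify either detail. The function names `stereographic_project` and `ensemble_views` are hypothetical.

```python
import math


def stereographic_project(point):
    """Map a 3D point to the plane z = 0 via the unit sphere.

    Assumption: the point is first pushed radially onto the unit sphere,
    then projected from the north pole (0, 0, 1). Returns None for the
    origin and for the pole itself, which have no finite image.
    """
    x, y, z = point
    norm = math.sqrt(x * x + y * y + z * z)
    if norm == 0.0:
        return None  # the origin has no direction on the sphere
    x, y, z = x / norm, y / norm, z / norm
    denom = 1.0 - z
    if denom < 1e-12:
        return None  # the north pole maps to infinity
    return (x / denom, y / denom)


def ensemble_views(view_probs):
    """Average per-view class probabilities (assumed ensemble rule).

    view_probs is a list of equal-length probability lists, one per view.
    Returns (index of the best class, averaged probabilities).
    """
    n_views = len(view_probs)
    n_classes = len(view_probs[0])
    avg = [sum(p[c] for p in view_probs) / n_views for c in range(n_classes)]
    best = max(range(n_classes), key=lambda c: avg[c])
    return best, avg


if __name__ == "__main__":
    # A point on the equator stays on the unit circle in the plane.
    print(stereographic_project((2.0, 0.0, 0.0)))
    # Two views disagree; the averaged scores decide the class.
    print(ensemble_views([[0.6, 0.4], [0.2, 0.8]]))
```

In this sketch the south pole lands at the plane's origin and points near the north pole are pushed far out, which is why a real pipeline would also need to choose an image extent and resolution before feeding the result to a 2D CNN.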

