Learning 3D Shapes as Multi-Layered Height-maps using 2D Convolutional Networks
This work addresses memory-efficient 3D shape analysis for computer vision and graphics applications, though its contribution is an incremental improvement over existing representation methods.
The paper tackles the problem of representing 3D shapes for efficient processing by introducing a multi-layered height-map (MLH) representation that enables the use of 2D convolutional networks, achieving state-of-the-art classification results on the ModelNet dataset.
We present a novel global representation of 3D shapes that is suitable for 2D CNNs. We represent 3D shapes as multi-layered height-maps (MLH): at each grid location we store multiple height values, thereby capturing 3D shape detail that is hidden behind several layers of occlusion. We also provide a novel view-merging method for combining view-dependent information (e.g., MLH descriptors) from multiple views. Because it can use 2D CNNs, our method is highly memory efficient in terms of input resolution compared to voxel-based input. Combining MLH descriptors with our multi-view merging, we achieve state-of-the-art classification results on the ModelNet dataset.
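
To make the idea of a multi-layered height-map concrete, the following is a minimal sketch, not the construction used in the paper. The abstract does not say how the layers are chosen, so this sketch assumes the shape is given as a point set in the unit cube, that height is measured along z, and that the layers in each grid cell are evenly spaced order statistics of the heights falling in that cell (minimum, middle, maximum for three layers). The function name `multilayer_heightmap` and its parameters `grid` and `layers` are hypothetical.

```python
import numpy as np


def multilayer_heightmap(points, grid=32, layers=3):
    """Build a (layers, grid, grid) array of heights from an (n, 3) point array.

    Assumptions (not taken from the paper): points lie in [0, 1]^3, height is
    the z coordinate, and each cell keeps evenly spaced order statistics of
    the heights that fall into it. Empty cells are left at 0.
    """
    points = np.asarray(points, dtype=np.float64).reshape(-1, 3)
    # Map x, y to integer cell indices; a coordinate of exactly 1.0 goes to the last cell.
    ij = np.minimum((points[:, :2] * grid).astype(np.int64), grid - 1)
    mlh = np.zeros((layers, grid, grid), dtype=np.float32)

    # Group points by flattened cell index so each cell is visited once.
    flat = ij[:, 0] * grid + ij[:, 1]
    order = np.argsort(flat, kind="stable")
    flat_sorted = flat[order]
    z_sorted = points[order, 2]
    cells, starts = np.unique(flat_sorted, return_index=True)
    ends = np.append(starts[1:], len(flat_sorted))

    for cell, start, end in zip(cells, starts, ends):
        z = np.sort(z_sorted[start:end])
        # Pick `layers` heights spread from the lowest to the highest point in the cell.
        idx = np.round(np.linspace(0, len(z) - 1, layers)).astype(np.int64)
        mlh[:, cell // grid, cell % grid] = z[idx]
    return mlh


# Three points stacked in one cell: a single height-map could keep only one of them.
pts = np.array([[0.1, 0.1, 0.2], [0.1, 0.1, 0.5], [0.1, 0.1, 0.9]])
mlh = multilayer_heightmap(pts, grid=4, layers=3)
print(mlh[:, 0, 0])
```

The result is a stack of `layers` 2D images, so it can be fed to a 2D CNN as a multi-channel input. As an illustration of the memory argument with made-up sizes: at resolution 256, a dense voxel grid has 256^3 = 16,777,216 cells, while a three-layer height-map has 3 x 256^2 = 196,608 values, roughly 85 times fewer.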