VConv-DAE: Deep Volumetric Shape Learning Without Object Labels
The paper tackles the problem of learning 3D shape representations from noisy depth data without relying on object labels. The resulting method outperforms prior work on denoising and shape completion and gives competitive classification performance.
The work addresses incomplete or noisy 3D captures, a challenge for applications in computer vision and robotics, though it is incremental in that it builds on existing deep learning approaches for volumetric data.
With the advent of affordable depth sensors, 3D capture is becoming increasingly ubiquitous and has already made its way into commercial products. Yet capturing the geometry or complete shapes of everyday objects using scanning devices (e.g. Kinect) still comes with several challenges that result in noisy or even incomplete shapes. Recent success in deep learning has shown how to learn complex shape distributions in a data-driven way from large-scale 3D CAD model collections and to utilize them for 3D processing on volumetric representations, thereby circumventing problems of topology and tessellation. Prior work has shown encouraging results on problems ranging from shape completion to recognition. We provide an analysis of such approaches and discover that training, as well as the resulting representation, is strongly and unnecessarily tied to the notion of object labels. Thus, we propose a fully convolutional volumetric autoencoder that learns a volumetric representation from noisy data by estimating voxel occupancy grids. The proposed method outperforms prior work on challenging tasks like denoising and shape completion. We also show that the obtained deep embedding gives competitive performance when used for classification and promising results for shape interpolation.
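To make the training signal described above concrete, the following is a minimal NumPy sketch of the data side of a denoising setup on voxel occupancy grids: a clean grid is corrupted, and a predicted grid is scored against the clean one. It is an illustration under stated assumptions, not the paper's implementation. The 30x30x30 grid size, the random voxel-flip noise model, the binary cross-entropy loss, and the helper names `corrupt_voxels` and `occupancy_bce` are all assumptions chosen for the example; the network itself is omitted.

```python
import numpy as np

def corrupt_voxels(grid, flip_prob, rng):
    """Flip each voxel's occupancy independently with probability flip_prob.

    Assumed noise model for illustration only.
    """
    flips = rng.random(grid.shape) < flip_prob
    return np.where(flips, 1.0 - grid, grid)

def occupancy_bce(pred, target, eps=1e-7):
    """Mean binary cross-entropy between predicted occupancy probabilities
    and a clean binary occupancy grid (assumed loss, for illustration)."""
    pred = np.clip(pred, eps, 1.0 - eps)  # avoid log(0)
    return float(-np.mean(target * np.log(pred)
                          + (1.0 - target) * np.log(1.0 - pred)))

rng = np.random.default_rng(0)

# Toy "shape": a solid cube inside an assumed 30x30x30 occupancy grid.
clean = np.zeros((30, 30, 30))
clean[10:20, 10:20, 10:20] = 1.0

# Noisy input that a denoising autoencoder would receive.
noisy = corrupt_voxels(clean, flip_prob=0.1, rng=rng)

# Loss if the model simply echoed its noisy input, versus a perfect reconstruction.
loss_noisy = occupancy_bce(noisy, clean)
loss_clean = occupancy_bce(clean, clean)
```

In such a setup, a model would take `noisy` as input and be trained so that its predicted occupancies drive the loss from roughly `loss_noisy` toward `loss_clean`. No object labels appear anywhere in this objective, which is the property the abstract emphasizes.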