NodeSLAM: Neural Object Descriptors for Multi-View Shape Reconstruction
This work addresses the need for efficient and optimizable object representations in computer vision. It enables multi-view shape reconstruction for robotics and augmented reality, and supports joint optimization of object poses and shapes together with the camera trajectory.
The paper tackles the problem of 3D object shape reconstruction from RGB-D images by introducing learned object descriptors and a probabilistic rendering engine, achieving accurate and robust reconstruction that enables applications like robot grasping and object-level SLAM.
The choice of scene representation is crucial in both the shape inference algorithms it requires and the smart applications it enables. We present efficient and optimisable multi-class learned object descriptors together with a novel probabilistic and differentiable rendering engine, for principled full object shape inference from one or more RGB-D images. Our framework allows for accurate and robust 3D object reconstruction which enables multiple applications including robot grasping and placing, augmented reality, and the first object-level SLAM system capable of optimising object poses and shapes jointly with camera trajectory.
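The abstract does not describe how the rendering engine works internally. As an illustration only, the sketch below shows one common way a probabilistic depth renderer can be built from per-sample occupancy probabilities along a camera ray: the ray terminates at a sample with probability equal to that sample's occupancy times the probability that all earlier samples were free, and the rendered depth is the expectation over termination points. The function name `render_depth`, the `far_depth` parameter, and the sampling scheme are assumptions made for this example, not the paper's implementation. Every operation is smooth in the occupancy values, which is the property that allows shape parameters to be refined by gradient-based optimisation against measured depth.

```python
import numpy as np


def render_depth(occupancy, depths, far_depth):
    """Expected depth along one ray (hypothetical helper, illustrative only).

    occupancy: occupancy probability in [0, 1] at each sample along the ray
    depths:    depth of each sample, ordered from near to far
    far_depth: depth assigned if the ray passes every sample without a hit
    """
    occupancy = np.asarray(occupancy, dtype=float)
    depths = np.asarray(depths, dtype=float)

    # Probability that the ray is still unobstructed before reaching sample i.
    free_before = np.concatenate(([1.0], np.cumprod(1.0 - occupancy)[:-1]))

    # Probability that the ray terminates exactly at sample i.
    termination = occupancy * free_before

    # Probability that the ray escapes without hitting any sample.
    escape = np.prod(1.0 - occupancy)

    # Expected depth over all termination events, including escape.
    return float(np.sum(termination * depths) + escape * far_depth)


# A ray with uncertain occupancy at both samples blends their depths
# with the far depth according to the termination probabilities.
blended = render_depth([0.5, 0.5], [1.0, 2.0], far_depth=4.0)
```

Comparing such a rendered depth with the depth measured by an RGB-D sensor gives a residual that could, under these assumptions, drive the optimisation of an object's shape descriptor and pose.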