DeepDeform: Learning Non-rigid RGB-D Reconstruction with Semi-supervised Data
This work addresses the challenge of reconstructing highly non-rigid deformations in 3D for computer vision applications, and reports significant gains over existing non-rigid reconstruction methods.
The paper tackles the problem of non-rigid 3D reconstruction from RGB-D data by addressing the lack of large-scale training data, introducing a semi-supervised strategy to create a dataset of 400 scenes and over 390,000 frames, and proposing a neural network that significantly outperforms existing methods.
Applying data-driven approaches to non-rigid 3D reconstruction has been difficult, which we believe can be attributed to the lack of a large-scale training corpus. One recent approach proposes self-supervision based on non-rigid reconstruction. Unfortunately, this method fails for important cases such as highly non-rigid deformations. We first address this lack of data by introducing a novel semi-supervised strategy to obtain dense inter-frame correspondences from a sparse set of annotations. This way, we obtain a large dataset of 400 scenes, over 390,000 RGB-D frames, and 5,533 densely aligned frame pairs; in addition, we provide a test set along with several metrics for evaluation. Based on this corpus, we introduce a data-driven non-rigid feature matching approach, which we integrate into an optimization-based reconstruction pipeline. Here, we propose a new neural network that operates on RGB-D frames, while maintaining robustness under large non-rigid deformations and producing accurate predictions. Our approach significantly outperforms existing non-rigid reconstruction methods that do not use learned data terms, as well as learning-based approaches that only use self-supervision.
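As a rough illustration only, and not the pipeline described above, the following sketch shows how matches with per-match confidences could enter an optimization as a weighted data term. It assumes a single global translation in place of a full non-rigid deformation model, and the names `weighted_translation` and `data_term_energy` are hypothetical.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


def weighted_translation(
    src: Sequence[Point], dst: Sequence[Point], weights: Sequence[float]
) -> Point:
    """Closed-form minimizer of sum_i w_i * ||src_i + t - dst_i||^2 over t.

    Assumption: one global translation t stands in for the deformation
    model; a real non-rigid pipeline would solve for many local transforms.
    """
    if not (len(src) == len(dst) == len(weights)):
        raise ValueError("src, dst and weights must have the same length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    # Setting the gradient to zero gives the weighted mean of the offsets.
    return tuple(
        sum(w * (q[k] - p[k]) for p, q, w in zip(src, dst, weights)) / total
        for k in range(3)
    )


def data_term_energy(
    src: Sequence[Point],
    dst: Sequence[Point],
    weights: Sequence[float],
    t: Point,
) -> float:
    """Weighted sum of squared residuals for a given translation t."""
    return sum(
        w * sum((p[k] + t[k] - q[k]) ** 2 for k in range(3))
        for p, q, w in zip(src, dst, weights)
    )


# Hypothetical matches: three consistent ones and one gross outlier.
source_points = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0), (0.0, 1.0, 1.0), (1.0, 1.0, 1.0)]
target_points = [(0.1, 0.0, 0.8), (1.1, 0.0, 0.8), (0.1, 1.0, 0.8), (5.0, 5.0, 5.0)]
# Hypothetical confidences; the outlier match is given zero weight.
confidences = [1.0, 1.0, 1.0, 0.0]

translation = weighted_translation(source_points, target_points, confidences)
energy = data_term_energy(source_points, target_points, confidences, translation)
```

With the outlier weighted to zero, the recovered translation matches the offset shared by the three consistent matches. This is the sense in which confidence-weighted matches can keep a data term robust under bad correspondences.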