NRMVS: Non-Rigid Multi-View Stereo
NRMVS addresses the static-scene assumption of multi-view stereo by handling scenes that deform non-rigidly, such as clothes or human bodies, a setting the paper presents as a new direction in computer vision.
The paper tackles dense 3D reconstruction of scenes with non-rigid motion from sparse RGB images. It proposes a joint optimization of deformation and depth estimation, and demonstrates that the method can create dense 4D structures and interpolate novel deformed scenes.
Scene reconstruction from unorganized RGB images is an important task in many computer vision applications. Multi-view stereo (MVS) is a common solution in photogrammetry applications for the dense reconstruction of a static scene. The static scene assumption, however, limits the general applicability of MVS algorithms, as many day-to-day scenes undergo non-rigid motion, e.g., clothes, faces, or human bodies. In this paper, we open up a new challenging direction: dense 3D reconstruction of scenes with non-rigid changes observed from arbitrary, sparse, and wide-baseline views. We formulate the problem as a joint optimization of deformation and depth estimation, using deformation graphs as the underlying representation. We propose a new sparse 3D-to-2D matching technique, together with a dense patch-match evaluation scheme, to estimate deformation and depth with photometric consistency. We show that creating a dense 4D structure from a few RGB images with non-rigid changes is possible, and demonstrate that our method can be used to interpolate novel deformed scenes from various combinations of the deformation estimates derived from the sparse views.
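To make the deformation-graph representation mentioned above more concrete, the following is a minimal sketch of how a graph of nodes, each carrying a rotation and a translation, can warp a 3D point. It assumes the commonly used embedded-deformation blending with normalized Gaussian weights; the abstract does not specify the paper's exact formulation, so the function names (`warp_point`, `node_weights`, `rot_z`), the `sigma` parameter, and the weighting scheme are all illustrative assumptions rather than the authors' implementation.

```python
import math


def mat_vec(R, v):
    """Multiply a 3x3 matrix (nested lists) by a 3-vector."""
    return [sum(R[i][k] * v[k] for k in range(3)) for i in range(3)]


def rot_z(angle):
    """Rotation matrix about the z-axis (helper used only for this example)."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def node_weights(point, nodes, sigma=1.0):
    """Normalized Gaussian weights of a point w.r.t. each graph node.

    Assumption: Gaussian falloff with distance; other weightings are possible.
    """
    raw = []
    for node in nodes:
        sq_dist = sum((p - g) ** 2 for p, g in zip(point, node))
        raw.append(math.exp(-sq_dist / (2.0 * sigma ** 2)))
    total = sum(raw)
    return [w / total for w in raw]


def warp_point(point, nodes, rotations, translations, sigma=1.0):
    """Warp a point by blending per-node rigid transforms.

    Each node j contributes R_j (v - g_j) + g_j + t_j, weighted by w_j.
    """
    weights = node_weights(point, nodes, sigma)
    warped = [0.0, 0.0, 0.0]
    for w, g, R, t in zip(weights, nodes, rotations, translations):
        local = [p - gi for p, gi in zip(point, g)]  # point relative to node
        rotated = mat_vec(R, local)
        for i in range(3):
            warped[i] += w * (rotated[i] + g[i] + t[i])
    return warped


if __name__ == "__main__":
    nodes = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]]
    identity = rot_z(0.0)
    # Both nodes translate by the same offset, so the point shifts rigidly.
    shift = [[0.0, 0.0, 1.0], [0.0, 0.0, 1.0]]
    print(warp_point([0.5, 0.2, 0.0], nodes, [identity, identity], shift))
```

In a joint optimization of the kind the abstract describes, the per-node rotations and translations would be the unknowns, adjusted together with depth until the warped geometry is photometrically consistent across views; here they are simply given as inputs.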