Accurate Human Body Reconstruction for Volumetric Video
This work addresses the problem of improving reconstruction quality in volumetric video production. The contribution is incremental: it builds on an existing production pipeline and adds new optimizations.
The paper tackles high-fidelity human body reconstruction for volumetric video by enhancing a professional pipeline with deep learning-based multi-view stereo networks and a novel depth map post-processing step, achieving a high level of geometric detail.
In this work, we enhance a professional end-to-end volumetric video production pipeline to achieve high-fidelity human body reconstruction using only passive cameras. While current volumetric video approaches estimate depth maps using traditional stereo matching techniques, we introduce and optimize deep learning-based multi-view stereo networks for depth map estimation in the context of professional volumetric video reconstruction. Furthermore, we propose a novel depth map post-processing approach, comprising filtering and fusion, that takes into account photometric confidence, cross-view geometric consistency, foreground masks, and camera viewing frustums. We show that our method can generate high levels of geometric detail for reconstructed human bodies.
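To illustrate how several of these cues could be combined in a depth map filtering step, here is a minimal sketch. It is not the implementation described in this work. The function name `filter_depth_map`, the thresholds `min_confidence` and `min_consistent_views`, and the assumption that the number of geometrically consistent views per pixel has already been computed are all hypothetical choices made for illustration. The camera viewing frustum test and the fusion step are omitted.

```python
from typing import List

Grid = List[List[float]]


def filter_depth_map(
    depth: Grid,
    confidence: Grid,
    foreground: List[List[bool]],
    consistent_views: List[List[int]],
    min_confidence: float = 0.8,    # assumed threshold, not from the paper
    min_consistent_views: int = 2,  # assumed threshold, not from the paper
    invalid: float = 0.0,
) -> Grid:
    """Keep a depth value only if it passes every per-pixel check.

    A pixel is kept when it has a positive depth, lies inside the
    foreground mask, has a photometric confidence of at least
    `min_confidence`, and is geometrically consistent with at least
    `min_consistent_views` other views. Rejected pixels are set to
    `invalid`.
    """
    filtered: Grid = []
    rows = zip(depth, confidence, foreground, consistent_views)
    for d_row, c_row, m_row, v_row in rows:
        out_row = []
        for d, c, m, v in zip(d_row, c_row, m_row, v_row):
            keep = (
                d > 0
                and m
                and c >= min_confidence
                and v >= min_consistent_views
            )
            out_row.append(d if keep else invalid)
        filtered.append(out_row)
    return filtered


# Toy 2x2 example with made-up values.
depth = [[2.0, 2.1], [2.2, 5.0]]
confidence = [[0.9, 0.5], [0.95, 0.9]]
foreground = [[True, True], [True, False]]
consistent_views = [[3, 3], [1, 4]]

result = filter_depth_map(depth, confidence, foreground, consistent_views)
print(result)  # [[2.0, 0.0], [0.0, 0.0]]
```

In the toy example only the top-left pixel survives: the top-right pixel fails the confidence check, the bottom-left pixel is consistent with too few views, and the bottom-right pixel lies outside the foreground mask.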