SOAR: Self-Occluded Avatar Recovery from a Single Video In the Wild
The work addresses a common challenge in in-the-wild monocular human reconstruction, though it appears incremental because it builds on existing priors and benchmarks.
The paper tackles the problem of reconstructing complete human avatars from a single video with self-occlusions, where parts of the body are never observed. It introduces SOAR, which uses a structural normal prior and a generative diffusion prior to achieve results comparable to state-of-the-art and concurrent methods.
Self-occlusion is common when capturing people in the wild, where performers do not follow predefined motion scripts. This challenges existing monocular human reconstruction systems that assume full body visibility. We introduce Self-Occluded Avatar Recovery (SOAR), a method for complete human reconstruction from partial observations where parts of the body are entirely unobserved. SOAR leverages a structural normal prior and a generative diffusion prior to address this ill-posed reconstruction problem. For the structural normal prior, we model the human with a reposable surfel model with well-defined and easily readable shapes. For the generative diffusion prior, we perform an initial reconstruction and refine it using score distillation. On various benchmarks, we show that SOAR performs favorably against state-of-the-art reconstruction and generation methods, and on par with concurrent works. Additional video results and code are available at https://soar-avatar.github.io/.
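
To make the "refine it using score distillation" step concrete, the following is a minimal toy sketch of a generic score distillation update, not SOAR's implementation. Everything in it is an assumption made for illustration: the "renderer" is the identity (the parameters theta are the rendered values themselves), the diffusion model is replaced by a hypothetical closed-form noise predictor for a Gaussian prior N(mu, s^2), the noise schedule is a simple variance-preserving one, and the weighting is w(t) = sigma_t^2. The names `toy_noise_predictor` and `sds_refine` are hypothetical.

```python
import math
import random


def toy_noise_predictor(x_t, alpha, sigma, mu, s):
    """Stand-in for a diffusion model (assumption): exact noise prediction
    when the clean data follows N(mu, s^2) and x_t = alpha * x + sigma * eps."""
    return sigma * (x_t - alpha * mu) / (alpha ** 2 * s ** 2 + sigma ** 2)


def sds_refine(theta, mu, s=0.1, steps=2000, lr=0.1, seed=0):
    """Refine an initial estimate `theta` with score-distillation-style updates.

    Each step noises the current estimate, asks the (toy) prior to predict the
    noise, and moves theta along w(t) * (eps_hat - eps). The render Jacobian is
    the identity here because theta is used directly as the "image".
    """
    rng = random.Random(seed)
    theta = list(theta)
    for _ in range(steps):
        t = rng.uniform(0.02, 0.98)            # random diffusion time
        alpha, sigma = math.sqrt(1.0 - t), math.sqrt(t)
        for i in range(len(theta)):
            eps = rng.gauss(0.0, 1.0)          # injected noise
            x_t = alpha * theta[i] + sigma * eps
            eps_hat = toy_noise_predictor(x_t, alpha, sigma, mu[i], s)
            grad = sigma ** 2 * (eps_hat - eps)  # w(t) = sigma_t^2 (assumed)
            theta[i] -= lr * grad
    return theta


if __name__ == "__main__":
    prior_mean = [1.0, -2.0, 0.5]          # what the toy prior "believes"
    initial = [0.0, 0.0, 0.0]              # crude initial reconstruction
    print(sds_refine(initial, prior_mean))
```

In this toy setting the refined values drift toward the prior mean, which mirrors the general idea of letting a generative prior fill in what the initial reconstruction could not observe. In a real pipeline the gradient would additionally be backpropagated through a differentiable renderer into the avatar parameters.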