Body Knowledge and Uncertainty Modeling for Monocular 3D Human Body Reconstruction
This work addresses the difficulty of acquiring accurate 3D data for training reconstruction models, a problem that affects researchers and practitioners in computer vision and human pose estimation. The contribution appears incremental, as it builds on existing weakly-supervised approaches.
The paper tackles insufficient 3D supervision in monocular 3D human body reconstruction by proposing KNOWN, a framework that combines body knowledge constraints with uncertainty modeling so that models can be trained without 3D data. The reported result is improved performance over prior weakly-supervised methods, especially on challenging minority images.
While 3D body reconstruction methods have made remarkable progress recently, it remains difficult to acquire accurate 3D supervision in the quantity required for training. In this paper, we propose \textbf{KNOWN}, a framework that uses body \textbf{KNOW}ledge and u\textbf{N}certainty modeling to compensate for insufficient 3D supervision. KNOWN exploits a comprehensive set of generic body constraints derived from well-established body knowledge. These generic constraints explicitly characterize the plausibility of a reconstruction and enable 3D reconstruction models to be trained without any 3D data. Moreover, existing methods typically train on images from multiple datasets, which can introduce data noise (\textit{e.g.}, inconsistent joint annotations) and data imbalance (\textit{e.g.}, minority images that show unusual poses or are captured from challenging camera views). KNOWN addresses these problems through a novel probabilistic framework that models both aleatoric and epistemic uncertainty. Aleatoric uncertainty is encoded in a robust Negative Log-Likelihood (NLL) training loss, while epistemic uncertainty is used to guide model refinement. Experiments demonstrate that KNOWN's body reconstruction outperforms prior weakly-supervised approaches, particularly on the challenging minority images.
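To make the idea of encoding aleatoric uncertainty in an NLL loss concrete, the following is a minimal sketch of a generic heteroscedastic Gaussian NLL over 2D keypoints. It is not KNOWN's actual loss, whose exact form the abstract does not give. The function name `gaussian_nll`, the per-joint log-variance input `log_var`, and the isotropic 2D Gaussian assumption are all illustrative choices.

```python
import math


def gaussian_nll(pred, target, log_var):
    """Mean per-joint NLL of an isotropic 2D Gaussian (additive constant dropped).

    pred, target: lists of (x, y) keypoints; log_var: list of per-joint log-variances.
    All names here are hypothetical and chosen only for illustration.
    """
    total = 0.0
    for (px, py), (tx, ty), s in zip(pred, target, log_var):
        sq_err = (px - tx) ** 2 + (py - ty) ** 2
        # For variance sigma^2 = exp(s) in each of the two dimensions:
        #   NLL = 0.5 * exp(-s) * ||err||^2 + s + const
        total += 0.5 * math.exp(-s) * sq_err + s
    return total / len(pred)


# One joint whose annotation is off by (2, 2) pixels, i.e. squared error 8.
pred = [(0.0, 0.0)]
noisy_target = [(2.0, 2.0)]

loss_fixed_var = gaussian_nll(pred, noisy_target, [0.0])             # unit variance
loss_learned_var = gaussian_nll(pred, noisy_target, [math.log(4.0)])  # larger variance
```

In this sketch, assigning a larger variance to the noisy joint lowers its loss, so an inconsistent annotation is down-weighted rather than forcing the prediction toward it. The additive `s` term penalizes inflating the variance on clean joints.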