CVFeb 27, 2025
PI-HMR: Towards Robust In-bed Temporal Human Shape Reconstruction with Contact Pressure SensingZiyu Wu, Yufan Xiong, Mengting Niu et al.
Long-term in-bed monitoring benefits automatic and real-time health management within healthcare, and the advancement of human shape reconstruction technologies further enhances the representation and visualization of users' activity patterns. However, existing technologies are primarily based on visual cues, facing serious challenges in non-light-of-sight and privacy-sensitive in-bed scenes. Pressure-sensing bedsheets offer a promising solution for real-time motion reconstruction. Yet, limited exploration in model designs and data have hindered its further development. To tackle these issues, we propose a general framework that bridges gaps in data annotation and model design. Firstly, we introduce SMPLify-IB, an optimization method that overcomes the depth ambiguity issue in top-view scenarios through gravity constraints, enabling generating high-quality 3D human shape annotations for in-bed datasets. Then we present PI-HMR, a temporal-based human shape estimator to regress meshes from pressure sequences. By integrating multi-scale feature fusion with high-pressure distribution and spatial position priors, PI-HMR outperforms SOTA methods with 17.01mm Mean-Per-Joint-Error decrease. This work provides a whole
CVDec 17, 2025
From Camera to World: A Plug-and-Play Module for Human Mesh TransformationChanghai Ma, Ziyu Wu, Yunkang Zhang et al.
Reconstructing accurate 3D human meshes in the world coordinate system from in-the-wild images remains challenging due to the lack of camera rotation information. While existing methods achieve promising results in the camera coordinate system by assuming zero camera rotation, this simplification leads to significant errors when transforming the reconstructed mesh to the world coordinate system. To address this challenge, we propose Mesh-Plug, a plug-and-play module that accurately transforms human meshes from camera coordinates to world coordinates. Our key innovation lies in a human-centered approach that leverages both RGB images and depth maps rendered from the initial mesh to estimate camera rotation parameters, eliminating the dependency on environmental cues. Specifically, we first train a camera rotation prediction module that focuses on the human body's spatial configuration to estimate camera pitch angle. Then, by integrating the predicted camera parameters with the initial mesh, we design a mesh adjustment module that simultaneously refines the root joint orientation and body pose. Extensive experiments demonstrate that our framework outperforms state-of-the-art methods on the benchmark datasets SPEC-SYN and SPEC-MTP.
CVSep 4, 2025
WATCH: World-aware Allied Trajectory and pose reconstruction for Camera and HumanQijun Ying, Zhongyuan Hu, Rui Zhang et al.
Global human motion reconstruction from in-the-wild monocular videos is increasingly demanded across VR, graphics, and robotics applications, yet requires accurate mapping of human poses from camera to world coordinates-a task challenged by depth ambiguity, motion ambiguity, and the entanglement between camera and human movements. While human-motion-centric approaches excel in preserving motion details and physical plausibility, they suffer from two critical limitations: insufficient exploitation of camera orientation information and ineffective integration of camera translation cues. We present WATCH (World-aware Allied Trajectory and pose reconstruction for Camera and Human), a unified framework addressing both challenges. Our approach introduces an analytical heading angle decomposition technique that offers superior efficiency and extensibility compared to existing geometric methods. Additionally, we design a camera trajectory integration mechanism inspired by world models, providing an effective pathway for leveraging camera translation information beyond naive hard-decoding approaches. Through experiments on in-the-wild benchmarks, WATCH achieves state-of-the-art performance in end-to-end trajectory reconstruction. Our work demonstrates the effectiveness of jointly modeling camera-human motion relationships and offers new insights for addressing the long-standing challenge of camera translation integration in global human motion reconstruction. The code will be available publicly.