S3: Neural Shape, Skeleton, and Skinning Fields for 3D Human Modeling
The work addresses the need for scalable human modeling in applications such as virtual reality and robotics simulation, offering a method that handles diverse shapes and poses without fitting a parametric body model.
The paper tackles the problem of automatically reconstructing and animating 3D humans from real-world data by representing shape, pose, and skinning weights as neural implicit functions. It reports that the reconstructions outperform existing state-of-the-art methods and that the approach enables animation from a single RGB image, optionally combined with a LiDAR sweep.
Constructing and animating humans is an important component for building virtual worlds in a wide variety of applications such as virtual reality or robotics testing in simulation. As there are exponentially many variations of humans with different shape, pose, and clothing, it is critical to develop methods that can automatically reconstruct and animate humans at scale from real-world data. Towards this goal, we represent the pedestrian's shape, pose, and skinning weights as neural implicit functions that are directly learned from data. This representation does not require explicitly fitting a human parametric body model, allowing us to handle a wider range of human shapes, poses, geometries, and topologies. We demonstrate the effectiveness of our approach on various datasets and show that our reconstructions outperform existing state-of-the-art methods. Furthermore, our re-animation experiments show that we can generate 3D human animations at scale from a single RGB image (and/or an optional LiDAR sweep) as input.
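To make the idea of a skinning field concrete, here is a minimal sketch of how per-point skinning weights can drive re-animation. It is not the authors' implementation: it assumes the standard linear blend skinning formulation, which the abstract does not spell out, and it replaces the learned neural field with a hand-written softmax over distances to joints. The names `skinning_weights`, `linear_blend_skinning`, and the `temperature` parameter are hypothetical and chosen only for illustration.

```python
import numpy as np


def skinning_weights(points, joints, temperature=0.1):
    """Toy stand-in for a learned skinning field (assumption, not the paper's network).

    Maps each query point to a weight per joint via a softmax over
    negative squared distances, so every row sums to 1.
    points: (N, 3), joints: (J, 3) -> weights: (N, J)
    """
    d2 = ((points[:, None, :] - joints[None, :, :]) ** 2).sum(axis=-1)
    logits = -d2 / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    w = np.exp(logits)
    return w / w.sum(axis=1, keepdims=True)


def linear_blend_skinning(points, weights, transforms):
    """Pose points by blending per-joint rigid transforms.

    points: (N, 3), weights: (N, J), transforms: (J, 4, 4) -> (N, 3)
    """
    ones = np.ones((points.shape[0], 1))
    homo = np.concatenate([points, ones], axis=1)            # (N, 4)
    per_joint = np.einsum("jab,nb->nja", transforms, homo)   # (N, J, 4)
    blended = (weights[:, :, None] * per_joint).sum(axis=1)  # (N, 4)
    return blended[:, :3]


# Two joints along the y-axis; the second joint is shifted by +1 in x.
joints = np.array([[0.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
points = np.array([[0.0, 0.0, 0.0], [0.0, 0.5, 0.0], [0.0, 1.0, 0.0]])

transforms = np.stack([np.eye(4), np.eye(4)])
transforms[1, 0, 3] = 1.0  # translate joint 1 by one unit along x

weights = skinning_weights(points, joints)
posed = linear_blend_skinning(points, weights, transforms)
```

Because the weights are defined for any point in space rather than for the vertices of a fixed template mesh, the same blending step applies to whatever surface the shape field produces. In this toy setup, the point midway between the two joints receives equal weights and moves half as far as a point sitting on the translated joint.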