CV · GR · Jun 9, 2024

Diverse 3D Human Pose Generation in Scenes based on Decoupled Structure

arXiv:2406.05691v1
1 citation
Originality: Incremental advance
AI Analysis

The paper addresses the need for more varied and realistic human-scene interactions in computer vision and graphics. The advance is incremental, since it builds on existing datasets and methods.

This paper tackles the problem of limited diversity in generating 3D human poses in scenes by proposing a decoupled method that separates pose and interaction generation, resulting in more physically plausible interactions and diverse poses as validated on the PROX and MP3D-R datasets.

This paper presents a novel method for generating diverse 3D human poses in scenes with semantic control. Existing methods rely heavily on human-scene interaction datasets, which limits the diversity of the generated human poses. To overcome this challenge, we propose to decouple the pose and interaction generation processes. Our approach consists of three stages: pose generation, contact generation, and placing the human into the scene. We train a pose generator on a human dataset to learn a rich pose prior, and a contact generator on a human-scene interaction dataset to learn a human-scene contact prior. Finally, the placing module puts the human body into the scene in a suitable and natural manner. Experimental results on the PROX dataset demonstrate that our method produces more physically plausible interactions and more diverse human poses. Experiments on the MP3D-R dataset further validate the generalization ability of our method.
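The decoupled three-stage structure described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: every function name, the point-based body representation, the "lowest points are in contact" rule, and the flat-floor scene are assumptions made purely for illustration. In the paper, the pose and contact generators are learned models; here they are simple stand-ins so that the control flow stays visible.

```python
import random
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float, float]


@dataclass
class PlacedHuman:
    vertices: List[Point]  # body points in scene coordinates
    contact: List[bool]    # True where a point is meant to touch the scene


def generate_pose(rng: random.Random, n_vertices: int = 8) -> List[Point]:
    """Stage 1 stand-in (hypothetical): sample a body as random 3D points.

    A real pose generator would be a model trained on human-only data.
    """
    return [
        (rng.uniform(-0.3, 0.3), rng.uniform(-0.3, 0.3), rng.uniform(0.0, 1.7))
        for _ in range(n_vertices)
    ]


def generate_contact(vertices: List[Point], tol: float = 0.05) -> List[bool]:
    """Stage 2 stand-in (hypothetical): mark the lowest points as contacts.

    A real contact generator would be learned from human-scene interaction data.
    """
    z_min = min(v[2] for v in vertices)
    return [v[2] - z_min < tol for v in vertices]


def place_in_scene(
    vertices: List[Point],
    contact: List[bool],
    floor_height: float,
    position_xy: Tuple[float, float],
) -> PlacedHuman:
    """Stage 3 stand-in (hypothetical): translate the body so that its lowest
    contact point rests on a flat floor at the requested (x, y) position."""
    lowest_contact = min(v[2] for v, c in zip(vertices, contact) if c)
    dz = floor_height - lowest_contact
    dx, dy = position_xy
    moved = [(x + dx, y + dy, z + dz) for x, y, z in vertices]
    return PlacedHuman(moved, contact)


def generate_human_in_scene(
    seed: int,
    floor_height: float = 0.0,
    position_xy: Tuple[float, float] = (1.0, 2.0),
) -> PlacedHuman:
    """Run the three stages in order: pose, then contact, then placement."""
    rng = random.Random(seed)
    vertices = generate_pose(rng)
    contact = generate_contact(vertices)
    return place_in_scene(vertices, contact, floor_height, position_xy)
```

The point of the sketch is the separation of concerns: the pose stage never sees the scene, so it could be trained on a larger human-only dataset, while only the contact and placement stages depend on scene information. Calling `generate_human_in_scene` with different seeds yields different toy poses placed on the same floor.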

