CV · Dec 11, 2025

Point2Pose: A Generative Framework for 3D Human Pose Estimation with Multi-View Point Cloud Dataset

arXiv:2512.10321v1 · 1 citation · h-index: 1
Originality: Incremental advance
AI Analysis

This work addresses 3D human pose estimation for applications like motion capture or robotics, but it appears incremental as it builds on existing generative and attention-based methods.

The paper tackles 3D human pose estimation by proposing Point2Pose, a generative framework that models human poses from sequential point clouds and pose history, and it outperforms baseline models on various datasets.

We propose a novel generative approach for 3D human pose estimation. 3D human pose estimation poses several key challenges due to the complex geometry of the human body, self-occluding joints, and the requirement for large-scale real-world motion datasets. To address these challenges, we introduce Point2Pose, a framework that effectively models the distribution of human poses conditioned on sequential point clouds and pose history. Specifically, we employ a spatio-temporal point cloud encoder and a pose feature encoder to extract joint-wise features, followed by an attention-based generative regressor. Additionally, we present a large-scale indoor dataset, MVPose3D, which contains multiple modalities, including IMU data of non-trivial human motions, dense multi-view point clouds, and RGB images. Experimental results show that the proposed method outperforms the baseline models across various datasets.
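
To make the idea of "sampling a pose conditioned on a point cloud and pose history" concrete, here is a minimal toy sketch. It is not the Point2Pose architecture: the function names (`attend`, `sample_pose`), the distance-based attention scores, the `temperature` and `noise_std` parameters, and the synthetic data are all assumptions made for illustration. Learned encoders are replaced by raw coordinates, and the generative regressor is replaced by Gaussian noise around an attention-pooled mean.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(joint, points, temperature=0.05):
    """Pool the point cloud around one joint.

    Toy stand-in for learned attention (an assumption, not the paper's
    design): each point is scored by its negative squared distance to
    the joint's previous position, then softmax-weighted.
    """
    scores = [-sum((p - j) ** 2 for p, j in zip(pt, joint)) / temperature
              for pt in points]
    weights = softmax(scores)
    return [sum(w * pt[axis] for w, pt in zip(weights, points))
            for axis in range(3)]


def sample_pose(points, prev_pose, noise_std=0.01, rng=None):
    """Draw one pose sample conditioned on a point cloud and the previous pose."""
    rng = rng or random.Random()
    pose = []
    for joint in prev_pose:
        mean = attend(joint, points)  # joint-wise feature, here just a 3D mean
        pose.append([m + rng.gauss(0.0, noise_std) for m in mean])
    return pose


# Two hypothetical joints and a tiny synthetic point cloud around them.
prev_pose = [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
points = [[0.1, 0.0, 0.0], [-0.1, 0.0, 0.0], [0.0, 1.1, 0.0], [0.0, 0.9, 0.0]]
sample = sample_pose(points, prev_pose, rng=random.Random(0))
```

Calling `sample_pose` repeatedly with different random generators yields different plausible poses for the same input, which is the property that distinguishes a generative estimator from a deterministic regressor that returns a single point estimate.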

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
