HTNet: Human Topology Aware Network for 3D Human Pose Estimation
HTNet addresses the accumulation of errors at the end joints of limbs in 3D human pose estimation, an incremental advance whose measured gains are concentrated on those joints.
The paper tackles error propagation in 3D human pose estimation by proposing HTNet, a network that uses human topology constraints at joint, part, and body levels, resulting in an 18.7% accuracy improvement on end joints and achieving state-of-the-art results on benchmark datasets.
In 3D human pose estimation, errors propagate along the human body topology and accumulate at the end joints of limbs. Inspired by the backtracking mechanism in automatic control systems, we design an Intra-Part Constraint module that uses parent nodes as references to build topological constraints for end joints at the part level. To account for the hierarchy of the human topology, joint-level and body-level dependencies are captured via graph convolutional networks and self-attention, respectively. Based on these designs, we propose a novel Human Topology aware Network (HTNet), which adopts a channel-split progressive strategy to sequentially learn the structural priors of the human topology at multiple semantic levels: joint, part, and body. Extensive experiments show that the proposed method improves the estimation accuracy by 18.7% on the end joints of limbs and achieves state-of-the-art results on the Human3.6M and MPI-INF-3DHP datasets. Code is available at https://github.com/vefalun/HTNet.
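To make the notion of error accumulation at limb end joints concrete, the following minimal sketch measures the mean per-joint position error separately for end joints and for the remaining joints. It is not the HTNet implementation. The 17-joint ordering, the `PARENTS` array, the `LIMBS` grouping, and the function names are all assumptions made for illustration and should be checked against the data loader actually in use.

```python
import math

# ASSUMPTION: a 17-joint Human3.6M-style ordering, given as parent indices
# (-1 marks the root). Hypothetical; not taken from the HTNet code.
PARENTS = [-1, 0, 1, 2, 0, 4, 5, 0, 7, 8, 9, 8, 11, 12, 8, 14, 15]

# ASSUMPTION: each limb is a 3-joint chain whose last entry is the end joint
# and whose second-to-last entry is that end joint's parent node.
LIMBS = {
    "right_leg": [1, 2, 3],
    "left_leg": [4, 5, 6],
    "left_arm": [11, 12, 13],
    "right_arm": [14, 15, 16],
}
END_JOINTS = [chain[-1] for chain in LIMBS.values()]


def depth(joint, parents=PARENTS):
    """Count the edges between a joint and the root of the skeleton tree."""
    d = 0
    while parents[joint] != -1:
        joint = parents[joint]
        d += 1
    return d


def per_joint_error(pred, target):
    """Euclidean distance between predicted and target 3D joints."""
    return [math.dist(p, t) for p, t in zip(pred, target)]


def grouped_error(pred, target):
    """Mean per-joint error over all joints, limb end joints, and the rest."""
    errors = per_joint_error(pred, target)
    end = [errors[j] for j in END_JOINTS]
    rest = [e for j, e in enumerate(errors) if j not in END_JOINTS]
    return {
        "all": sum(errors) / len(errors),
        "end_joints": sum(end) / len(end),
        "other": sum(rest) / len(rest),
    }


if __name__ == "__main__":
    # Toy data: the error of each joint grows with its distance from the root,
    # mimicking errors that accumulate along the kinematic chain.
    target = [(0.0, 0.0, 0.0)] * len(PARENTS)
    pred = [(float(depth(j)), 0.0, 0.0) for j in range(len(PARENTS))]
    print(grouped_error(pred, target))
```

Reporting the two groups separately is what makes a claim such as an improvement "on the end joints of limbs" checkable, since an average over all joints can hide where the error is concentrated.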