Interweaved Graph and Attention Network for 3D Human Pose Estimation
This work addresses a specific bottleneck in 3D human pose estimation for computer vision applications: prior methods rarely model global and local correlations between body joints together.
The paper tackles insufficient learning of human skeleton representations in 3D human pose estimation from single-view images. It proposes IGANet, which lets graph convolutional networks and attention exchange information in both directions, and reports state-of-the-art performance on the Human3.6M and MPI-INF-3DHP datasets.
Despite substantial progress in 3D human pose estimation from a single-view image, prior works rarely explore global and local correlations, leading to insufficient learning of human skeleton representations. To address this issue, we propose a novel Interweaved Graph and Attention Network (IGANet) that allows bidirectional communication between graph convolutional networks (GCNs) and attention. Specifically, we introduce an IGA module in which attention is provided with local information from GCNs and GCNs are injected with global information from attention. Additionally, we design a simple yet effective U-shaped multi-layer perceptron (uMLP) that captures multi-granularity information for body joints. Extensive experiments on two popular benchmark datasets, Human3.6M and MPI-INF-3DHP, show that IGANet achieves state-of-the-art performance on both. Code is available at https://github.com/xiu-cs/IGANet.
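To make the idea of bidirectional exchange concrete, the following is a minimal NumPy sketch, not the authors' implementation (which is in the linked repository). The abstract does not specify layer ordering, widths, or normalization, so everything here is an assumption made for illustration: the function names (`interweaved_block`, `u_mlp`), the 5-joint chain skeleton, the single-head attention, the additive way features are exchanged, and the halved bottleneck width with a skip connection.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def softmax(x):
    # Row-wise softmax, shifted by the row max for numerical stability.
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def normalized_adjacency(edges, num_joints):
    # D^{-1/2} (A + I) D^{-1/2} for an undirected skeleton graph.
    a = np.eye(num_joints)
    for i, j in edges:
        a[i, j] = a[j, i] = 1.0
    deg = a.sum(axis=1)
    return a / np.sqrt(np.outer(deg, deg))

def graph_conv(x, adj, w):
    # Local mixing: each joint aggregates only itself and its neighbours.
    return relu(adj @ x @ w)

def self_attention(x, wq, wk, wv):
    # Global mixing: every joint attends to every other joint.
    q, k, v = x @ wq, x @ wk, x @ wv
    weights = softmax(q @ k.T / np.sqrt(q.shape[-1]))
    return weights @ v

def interweaved_block(x, adj, p):
    # Hypothetical exchange scheme: run both branches once, then feed each
    # branch's output into the other branch before a second pass.
    local = graph_conv(x, adj, p["g1"])
    global_ = self_attention(x, p["q1"], p["k1"], p["v1"])
    attn_out = self_attention(x + local, p["q2"], p["k2"], p["v2"])  # attention sees local info
    gcn_out = graph_conv(x + global_, adj, p["g2"])                   # GCN sees global info
    return x + attn_out + gcn_out

def u_mlp(x, w_down, w_mid, w_up):
    # Assumed U shape: project channels down, transform, project back up,
    # and add a skip connection from the input.
    h = relu(x @ w_down)
    h = relu(h @ w_mid)
    return x + h @ w_up

rng = np.random.default_rng(0)
num_joints, channels = 5, 8
edges = [(0, 1), (1, 2), (2, 3), (3, 4)]  # toy chain, not a real skeleton
adj = normalized_adjacency(edges, num_joints)
x = rng.standard_normal((num_joints, channels))

names = ("g1", "g2", "q1", "k1", "v1", "q2", "k2", "v2")
p = {n: 0.1 * rng.standard_normal((channels, channels)) for n in names}

h = interweaved_block(x, adj, p)
out = u_mlp(
    h,
    0.1 * rng.standard_normal((channels, channels // 2)),
    0.1 * rng.standard_normal((channels // 2, channels // 2)),
    0.1 * rng.standard_normal((channels // 2, channels)),
)
print(out.shape)  # (5, 8)
```

In this sketch the graph branch can only propagate information along skeleton edges, while the attention branch mixes all joints at once. Adding each branch's first-pass output to the other's input is one simple way to let the two kinds of context inform each other.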