CV | Mar 8, 2022

Quantification of Occlusion Handling Capability of a 3D Human Pose Estimation Framework

arXiv:2203.04113v13.724 citations | h-index: 18 | Has Code

Originality: Incremental advance

AI Analysis

This work addresses occlusion handling in 3D human pose estimation, which is a domain-specific problem for computer vision applications, but it appears incremental as it builds on existing occlusion-aware methods.

The paper tackles the problem of 3D human pose estimation from monocular images under occlusion by proposing an occlusion-guided framework that uses 2D skeletons with missing joints as input, achieving significantly improved action recognition performance in the presence of missing joints.

3D human pose estimation from monocular images is an important yet challenging task. Existing 3D pose detection methods perform well under normal conditions; however, their performance may degrade under occlusion. Some occlusion-aware methods have recently been proposed, but the occlusion handling capability of these networks has not yet been thoroughly investigated. In this work, we propose an occlusion-guided 3D human pose estimation framework and quantify its occlusion handling capability using different protocols. The proposed method estimates more accurate 3D human poses from 2D skeletons with missing joints as input. Missing joints are handled by introducing occlusion guidance, which provides extra information about the absence or presence of each joint. Temporal information is also exploited to better estimate the missing joints. A large number of experiments are performed to quantify the occlusion handling capability of the proposed method on three publicly available datasets in various settings, including random missing joints, fixed body parts missing, and complete frames missing, using the mean per joint position error (MPJPE) criterion. In addition, the quality of the predicted 3D poses is evaluated using action classification performance as a criterion. 3D poses estimated by the proposed method achieve significantly improved action recognition performance in the presence of missing joints. Our experiments demonstrate the effectiveness of the proposed framework for handling missing joints, as well as for quantifying the occlusion handling capability of deep neural networks.
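The random-missing-joints protocol and the mean per joint position error criterion mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say how occlusion guidance is encoded, so the code below assumes a binary presence flag per joint (1 = present, 0 = missing) appended to zeroed-out 2D coordinates. The function names `mask_random_joints` and `mpjpe`, the 17-joint skeleton, and the toy values are hypothetical and are not taken from the paper's code.

```python
import math
import random


def mask_random_joints(pose_2d, num_missing, rng):
    """Simulate occlusion: drop `num_missing` random joints from a 2D skeleton.

    Returns the masked pose and a per-joint guidance flag
    (1.0 = joint present, 0.0 = joint missing). The binary encoding
    is an assumption made for this sketch.
    """
    num_joints = len(pose_2d)
    missing = set(rng.sample(range(num_joints), num_missing))
    guidance = [0.0 if j in missing else 1.0 for j in range(num_joints)]
    # Missing joints are zeroed; the flag tells a model the zeros are not real data.
    masked = [(x * g, y * g) for (x, y), g in zip(pose_2d, guidance)]
    return masked, guidance


def mpjpe(pred_3d, gt_3d):
    """Mean per joint position error: average Euclidean distance over joints."""
    dists = [math.dist(p, g) for p, g in zip(pred_3d, gt_3d)]
    return sum(dists) / len(dists)


rng = random.Random(0)
num_joints = 17  # assumed skeleton size, for illustration only

# Toy 2D skeleton with random coordinates.
pose_2d = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(num_joints)]
masked, guidance = mask_random_joints(pose_2d, num_missing=4, rng=rng)

# Assumed network input: (x, y, presence flag) per joint.
network_input = [(x, y, g) for (x, y), g in zip(masked, guidance)]

# Toy check of the metric: every predicted joint is offset by (3, 4, 0).
gt_3d = [(0.0, 0.0, 0.0)] * num_joints
pred_3d = [(3.0, 4.0, 0.0)] * num_joints
print(mpjpe(pred_3d, gt_3d))  # 5.0
```

Passing the flag alongside the coordinates lets a model distinguish a joint that is genuinely at the origin from one that was not detected. The fixed-body-part and complete-frame protocols could be sketched the same way by choosing the missing indices deterministically instead of at random.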
