Structured Feature Learning for Pose Estimation
This work addresses human pose estimation, a core computer vision task, and offers a method that improves accuracy, although the approach is incremental.
The paper tackles human pose estimation by proposing a structured feature learning framework that captures correlations among body joints at the feature level. Compared with a ConvNet baseline that learns features at each joint separately, it improves mean PCP (Percentage of Correct Parts) by 18% on the FLIC dataset.
In this paper, we propose a structured feature learning framework to reason about the correlations among body joints at the feature level in human pose estimation. Unlike existing approaches, which model structures on score maps or predicted labels, our framework operates on feature maps, which preserve substantially richer descriptions of body joints. The relationships between the feature maps of joints are captured with the introduced geometrical transform kernels, which can be easily implemented with a convolution layer. Features and their relationships are jointly learned in an end-to-end learning system. A bi-directional tree-structured model is proposed, so that the feature channels at a body joint can readily receive information from other joints. The proposed framework improves feature learning substantially. With very simple post-processing, it reaches the best mean PCP on the LSP and FLIC datasets. Compared with the baseline of learning features at each joint separately with a ConvNet, the mean PCP improves by 18% on FLIC. The code is publicly released.
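To make the idea of passing information between joint feature maps concrete, the following is a minimal NumPy sketch, not the released code. It assumes a three-joint chain (shoulder, elbow, wrist), single-channel 7x7 feature maps, and hand-set shift kernels standing in for the learned geometrical transform kernels. The helper names (`conv2d_same`, `shift_kernel`, `pass_messages`), the ReLU nonlinearity, and the concatenation of the two directions are illustrative assumptions rather than details given in the abstract.

```python
import numpy as np

def conv2d_same(x, w):
    """Zero-padded cross-correlation. x: (C_in, H, W); w: (C_out, C_in, k, k), k odd."""
    c_out, _, k, _ = w.shape
    pad = k // 2
    _, h, wd = x.shape
    xp = np.pad(x, ((0, 0), (pad, pad), (pad, pad)))
    out = np.zeros((c_out, h, wd))
    for dy in range(k):
        for dx in range(k):
            patch = xp[:, dy:dy + h, dx:dx + wd]           # (C_in, H, W)
            out += np.einsum("oc,chw->ohw", w[:, :, dy, dx], patch)
    return out

def shift_kernel(sy, sx, k=5):
    """Hypothetical stand-in for a learned kernel: moves responses by (sy, sx) pixels."""
    w = np.zeros((1, 1, k, k))
    w[0, 0, k // 2 - sy, k // 2 - sx] = 1.0
    return w

def relu(a):
    return np.maximum(a, 0.0)

def pass_messages(features, edges, kernels):
    """One directional pass along a tree.

    `edges` must be ordered so that a joint has received all of its
    incoming messages before it sends its own.
    """
    pre = {j: f.copy() for j, f in features.items()}
    for src, dst in edges:
        pre[dst] += conv2d_same(relu(pre[src]), kernels[(src, dst)])
    return {j: relu(f) for j, f in pre.items()}

# Toy input: only the elbow has evidence, a single peak at row 3, column 2.
H = W = 7
feats = {j: np.zeros((1, H, W)) for j in ("shoulder", "elbow", "wrist")}
feats["elbow"][0, 3, 2] = 1.0

# Assumed geometry: the wrist lies 2 px right of the elbow, the shoulder 2 px left.
kernels = {
    ("elbow", "wrist"): shift_kernel(0, 2),
    ("wrist", "elbow"): shift_kernel(0, -2),
    ("elbow", "shoulder"): shift_kernel(0, -2),
    ("shoulder", "elbow"): shift_kernel(0, 2),
}

down = pass_messages(feats, [("shoulder", "elbow"), ("elbow", "wrist")], kernels)
up = pass_messages(feats, [("wrist", "elbow"), ("elbow", "shoulder")], kernels)

# Bi-directional result: stack the features from both passes along the channel axis.
combined = {j: np.concatenate([up[j], down[j]], axis=0) for j in feats}
```

In this toy setting the downward pass moves the elbow evidence to where the wrist is expected, and the upward pass moves it to where the shoulder is expected, so each joint's features reflect its neighbours. In a trained system the kernels would be learned jointly with the features rather than fixed by hand.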