CV | AI | Dec 19, 2021

Parallel Multi-Scale Networks with Deep Supervision for Hand Keypoint Detection

arXiv:2112.10275v1 | 2 citations
Originality: Incremental advance
AI Analysis

This work addresses hand keypoint detection for applications like neuroscience studies, but it is incremental as it builds on existing feature fusion approaches.

The paper tackles keypoint detection for small objects such as human hands. It proposes P-MSDSNet, a CNN model that uses multi-scale deep supervision to produce attention maps for adaptive feature propagation, and reports that it outperforms state-of-the-art methods on benchmark datasets with fewer parameters.

Keypoint detection plays an important role in a wide range of applications. However, predicting keypoints of small objects such as human hands is a challenging problem. Recent works fuse feature maps of deep Convolutional Neural Networks (CNNs), either via multi-level feature integration or multi-resolution aggregation. Despite achieving some success, these feature fusion approaches increase the complexity and opacity of CNNs. To address this issue, we propose a novel CNN model named Multi-Scale Deep Supervision Network (P-MSDSNet) that learns feature maps at different scales with deep supervision to produce attention maps for adaptive feature propagation from layer to layer. P-MSDSNet has a multi-stage architecture, which makes it scalable, while its deep supervision with spatial attention improves the transparency of the feature learning at each stage. We show that P-MSDSNet outperforms state-of-the-art approaches on benchmark datasets while requiring fewer parameters. We also show the application of P-MSDSNet to quantifying finger-tapping hand movements in a neuroscience study.
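
The abstract does not give architectural details, so the following is only a minimal NumPy sketch of the general idea it describes: each stage predicts a supervised heatmap, that heatmap becomes a spatial attention map, and the attention gates the features passed on to the next, coarser stage. The function names (`stage`, `downsample`, `deep_supervision_loss`), the 1x1-convolution weights, the sigmoid gating, and the mean-squared-error loss are all assumptions made for illustration, not the authors' implementation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def stage(features, weights):
    """One hypothetical stage (an assumption, not the paper's layer design).

    features: array of shape (C, H, W)
    weights:  array of shape (C,), acting as a 1x1 convolution
    Returns the predicted heatmap (H, W) and the attention-gated features (C, H, W).
    """
    heatmap = np.tensordot(weights, features, axes=1)  # weighted sum over channels
    attention = sigmoid(heatmap)                       # spatial attention in (0, 1)
    gated = features * attention                       # broadcast over channels
    return heatmap, gated

def downsample(x, factor):
    """Average-pool the last two axes by an integer factor."""
    *lead, h, w = x.shape
    blocks = x.reshape(*lead, h // factor, factor, w // factor, factor)
    return blocks.mean(axis=(-3, -1))

def deep_supervision_loss(heatmaps, target):
    """Sum of per-stage MSE losses, each against the target pooled to that stage's scale."""
    total = 0.0
    for h in heatmaps:
        factor = target.shape[-1] // h.shape[-1]
        total += float(np.mean((h - downsample(target, factor)) ** 2))
    return total

# Toy run: three stages at resolutions 16, 8, and 4 with random weights.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 16, 16))      # fake backbone features
target = np.zeros((16, 16))
target[8, 8] = 1.0                    # one ground-truth keypoint

heatmaps = []
for _ in range(3):
    w = rng.normal(size=4)
    heatmap, gated = stage(x, w)
    heatmaps.append(heatmap)          # every stage is supervised
    x = downsample(gated, 2)          # gated features feed the next scale

loss = deep_supervision_loss(heatmaps, target)
print([h.shape for h in heatmaps], loss)
```

Because every stage emits its own heatmap, the intermediate predictions can be inspected directly, which is one plausible reading of the transparency claim in the abstract.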

