CV · LG · Aug 17, 2023

MovePose: A High-performance Human Pose Estimation Algorithm on Mobile and Edge Devices

Meta AI · Tsinghua
arXiv:2308.09084v4 · 8 citations · h-index: 106
Originality: Incremental advance
AI Analysis

The paper addresses the need for efficient pose estimation on resource-constrained devices, but the advance is incremental: it builds on existing lightweight CNN methods and adds specific optimizations.

The paper tackles the problem of achieving high accuracy and real-time performance for human pose estimation on mobile and edge devices, resulting in a model that attains 68.0 mAP on COCO and runs at 69+ fps on a CPU and 11+ fps on a mobile phone.
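To put the reported frame rates in terms of a per-frame time budget, the short sketch below converts fps to milliseconds per frame. The helper name `frame_time_ms` and the device labels are illustrative assumptions; only the 69 and 11 fps figures come from the summary above. Because those figures are reported as lower bounds ("69+", "11+"), the resulting frame times are upper bounds.

```python
def frame_time_ms(fps):
    """Return the time available per frame, in milliseconds, at a given frame rate."""
    return 1000.0 / fps


# Frame rates quoted in the summary (reported as lower bounds).
reported_fps = {"CPU": 69, "mobile phone": 11}

# Upper bound on per-frame latency for each device, rounded to 0.1 ms.
budgets = {device: round(frame_time_ms(fps), 1) for device, fps in reported_fps.items()}

for device, ms in budgets.items():
    print(f"{device}: at most about {ms} ms per frame")
```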

We present MovePose, an optimized lightweight convolutional neural network designed specifically for real-time body pose estimation on CPU-based mobile devices. Current solutions do not provide satisfactory accuracy and speed for human pose estimation, and MovePose addresses this gap: it aims to maintain real-time performance while improving accuracy on mobile devices. MovePose attains a mean Average Precision (mAP) of 68.0 on the COCO validation dataset. It runs at 69+ frames per second (fps) on an Intel i9-10920X CPU and at 452+ fps on an NVIDIA RTX 3090 GPU. On an Android phone equipped with a Snapdragon 8 + 4G processor, it exceeds 11 fps. To enhance accuracy, we incorporated three techniques: deconvolution, large kernel convolution, and coordinate classification. Compared to basic upsampling, deconvolution is trainable, improves model capacity, and enlarges the receptive field. Large kernel convolution strengthens these properties at a decreased computational cost. In summary, MovePose provides high accuracy and real-time performance, making it a potential tool for a variety of applications, including those focused on mobile-side human pose estimation. The code and models for this algorithm will be made publicly accessible.
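The contrast the abstract draws between basic upsampling and deconvolution can be illustrated with a minimal 1D sketch. This is not MovePose's code: the function names, the toy signal, and the kernel weights are all assumptions chosen for illustration, and the real model would use 2D layers from a deep learning framework. The point is only that a transposed convolution has kernel weights that training could adjust, whereas nearest-neighbour upsampling is fixed.

```python
def upsample_nearest(x, scale=2):
    """Fixed (non-trainable) upsampling: repeat each sample `scale` times."""
    return [v for v in x for _ in range(scale)]


def transposed_conv1d(x, kernel, stride=2):
    """1D transposed convolution (deconvolution) with no padding.

    Each input sample scatters a scaled copy of `kernel` into the output,
    so the kernel weights are parameters that training could adjust.
    """
    out = [0.0] * ((len(x) - 1) * stride + len(kernel))
    for i, v in enumerate(x):
        for j, w in enumerate(kernel):
            out[i * stride + j] += v * w
    return out


signal = [1.0, 2.0, 3.0]  # toy input, assumed for illustration

# With kernel [1, 1] and stride 2, deconvolution reproduces nearest upsampling.
nearest = upsample_nearest(signal)
deconv_fixed = transposed_conv1d(signal, [1.0, 1.0])

# A wider kernel (hypothetical weights) overlaps neighbouring inputs, so an
# output sample can depend on more than one input sample.
deconv_wide = transposed_conv1d(signal, [0.5, 1.0, 0.5])

print(nearest)
print(deconv_fixed)
print(deconv_wide)
```

With the `[1, 1]` kernel the two operations coincide, which shows that nearest-neighbour upsampling is one special case of what a deconvolution layer can represent; other kernel values give it additional capacity and, when the kernel is wider than the stride, a larger receptive field.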

