CV LG · Jun 26, 2020

Making DensePose fast and light

Ruslan Rakhimov, Emil Bogomolov, Alexandr Notchenko, Fung Mao, Alexey Artemov, Denis Zorin, Evgeny Burnaev

arXiv:2006.15190v32.31 citations · h-index: 56 · Has Code

Originality: Incremental advance

AI Analysis

This work addresses the challenge of enabling real-time, on-device DensePose inference for applications such as augmented reality and cloth fitting. The advance is incremental: it optimizes an existing model architecture rather than proposing a new one.

The authors tackled the problem of making DensePose estimation models more efficient for deployment on mobile and embedded devices, achieving a 17x reduction in model size and a 2x improvement in latency compared to the baseline while retaining most of its accuracy.

The DensePose estimation task is a significant step forward for enhancing the user experience of computer vision applications ranging from augmented reality to cloth fitting. Existing neural network models capable of solving this task are heavily parameterized and a long way from being transferred to an embedded or mobile device. To enable DensePose inference on the end device with current models, one needs to support an expensive server-side infrastructure and have a stable internet connection. To make things worse, mobile and embedded devices do not always have a powerful GPU inside. In this work, we target the problem of redesigning the DensePose R-CNN model's architecture so that the final network retains most of its accuracy but becomes lighter and faster. To achieve that, we tested and incorporated many deep learning innovations from recent years, specifically performing an ablation study on 23 efficient backbone architectures, multiple two-stage detection pipeline modifications, and custom model quantization methods. As a result, we achieved a $17\times$ model size reduction and a $2\times$ latency improvement compared to the baseline model.
