LG CV MM

Apr 22, 2020

Towards Real-Time DNN Inference on Mobile Platforms with Model Pruning and Compiler Optimization

Wei Niu, Pu Zhao, Zheng Zhan, Xue Lin, Yanzhi Wang, Bin Ren

arXiv:2004.11250v12.31 citations

Originality: Synthesis-oriented

AI Analysis

This paper addresses the problem of constrained computation and storage for mobile DNN applications, but it appears incremental because it builds on existing pruning and optimization methods.

The paper tackles the challenge of real-time DNN inference on mobile platforms by proposing structured model pruning and compiler optimization techniques, and it demonstrates real-time execution for applications such as style transfer and super resolution.

High-end mobile platforms rapidly serve as primary computing devices for a wide range of Deep Neural Network (DNN) applications. However, the constrained computation and storage resources on these devices still pose significant challenges for real-time DNN inference executions. To address this problem, we propose a set of hardware-friendly structured model pruning and compiler optimization techniques to accelerate DNN executions on mobile devices. This demo shows that these optimizations can enable real-time mobile execution of multiple DNN applications, including style transfer, DNN coloring and super resolution.
