Benchmarking of DL Libraries and Models on Mobile Devices
This work helps developers and researchers choose a deep learning library for mobile deployment. Its contribution is incremental: it benchmarks existing tools and does not introduce new methods.
The paper addresses the lack of quantitative performance analysis of deep learning libraries on mobile devices by benchmarking 6 libraries and 15 models across 10 devices. It finds that the best-performing library is heavily fragmented across models and hardware, and that the performance gaps between libraries can outweigh the gains from algorithmic or hardware optimizations.
Deploying deep learning (DL) on mobile devices has been a notable trend in recent years. To support fast on-device inference, DL libraries play as critical a role as algorithms and hardware do. Unfortunately, no prior work has examined the ecosystem of modern DL libs in depth or provided quantitative results on their performance. In this paper, we first build a comprehensive benchmark that includes 6 representative DL libs and 15 diversified DL models. We then perform extensive experiments on 10 mobile devices, which reveal a complete landscape of the current mobile DL lib ecosystem. For example, we find that the best-performing DL lib is severely fragmented across different models and hardware, and that the performance gap between DL libs can be very large. In fact, the impact of the DL lib can overwhelm the optimizations from algorithms or hardware, e.g., model quantization and GPU/DSP-based heterogeneous computing. Finally, based on these observations, we summarize practical implications for different roles in the DL lib ecosystem.
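To make the fragmentation finding concrete, the following is a minimal sketch of how one might time inference and then pick the fastest library per (model, device) pair. It is not the paper's benchmark harness. The function names, the warm-up and repeat counts, and the choice of median latency as the metric are all assumptions made for illustration.

```python
import statistics
import time
from typing import Callable, Dict, Tuple


def median_latency_ms(run_inference: Callable[[], object],
                      warmup: int = 3, repeats: int = 10) -> float:
    """Return the median wall-clock latency of run_inference in milliseconds.

    run_inference is a hypothetical zero-argument callable that wraps one
    forward pass in whichever DL lib is being measured.
    """
    for _ in range(warmup):
        run_inference()  # warm-up runs are discarded
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_inference()
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples)


def best_lib_per_config(
    latency_ms: Dict[Tuple[str, str, str], float]
) -> Dict[Tuple[str, str], Tuple[str, float]]:
    """Map each (model, device) to (fastest lib, slowest/fastest latency ratio).

    latency_ms is keyed by (lib, model, device); this layout is an assumption.
    """
    grouped: Dict[Tuple[str, str], Dict[str, float]] = {}
    for (lib, model, device), ms in latency_ms.items():
        grouped.setdefault((model, device), {})[lib] = ms

    result = {}
    for key, by_lib in grouped.items():
        fastest = min(by_lib, key=by_lib.get)
        gap = max(by_lib.values()) / by_lib[fastest]
        result[key] = (fastest, gap)
    return result
```

If the winning library differs from one (model, device) pair to the next, the table is fragmented in the sense the abstract describes, and the ratio gives a rough measure of how much is lost by picking the wrong library for that pair.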