CV · Dec 2, 2019

MnasFPN: Learning Latency-aware Pyramid Architecture for Object Detection on Mobile Devices

arXiv:1912.01106v2 · 58 citations
Originality: Highly original
AI Analysis

This work addresses the need for automated, latency-aware object detection models on mobile devices, representing an incremental improvement over prior manual or non-mobile-friendly automated designs.

The paper tackles the design of efficient object detection architectures for mobile devices. It proposes MnasFPN, a mobile-friendly search space for the detection head, and combines it with latency-aware architecture search. Paired with a MobileNetV2 body, the resulting head outperforms MobileNetV3+SSDLite by 1.8 mAP at similar latency on Pixel, and is 1.0 mAP more accurate and 10% faster than NAS-FPNLite.

Despite the blooming success of architecture search for vision tasks in resource-constrained environments, the design of on-device object detection architectures has mostly been manual. The few automated search efforts are either centered around non-mobile-friendly search spaces or not guided by on-device latency. We propose MnasFPN, a mobile-friendly search space for the detection head, and combine it with latency-aware architecture search to produce efficient object detection models. The learned MnasFPN head, when paired with a MobileNetV2 body, outperforms MobileNetV3+SSDLite by 1.8 mAP at similar latency on Pixel. It is also both 1.0 mAP more accurate and 10% faster than NAS-FPNLite. Ablation studies show that the majority of the performance gain comes from innovations in the search space. Further explorations reveal an interesting coupling between the search space design and the search algorithm, and that the complexity of the MnasFPN search space may be at a local optimum.
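The abstract does not spell out how on-device latency guides the search. As a rough illustration only, the sketch below assumes an MnasNet-style soft-constraint reward, where a candidate's mAP is scaled by (latency / target) raised to a negative exponent. The function name, the exponent value `w=-0.07`, the latency target, and the candidate numbers are all assumptions made for this example, not values taken from the paper.

```python
def latency_aware_reward(map_score, latency_ms, target_ms, w=-0.07):
    """Hypothetical MnasNet-style reward: mAP * (latency / target) ** w.

    With w < 0, candidates slower than the target are penalized and
    faster ones get a small bonus. All names and defaults here are
    assumptions for illustration.
    """
    if latency_ms <= 0 or target_ms <= 0:
        raise ValueError("latencies must be positive")
    return map_score * (latency_ms / target_ms) ** w


# Made-up candidates: name -> (mAP, measured latency in ms).
candidates = {
    "head_a": (25.0, 180.0),
    "head_b": (25.5, 260.0),
    "head_c": (24.0, 150.0),
}
TARGET_MS = 200.0  # assumed latency budget

rewards = {
    name: latency_aware_reward(m, lat, TARGET_MS)
    for name, (m, lat) in candidates.items()
}
best = max(rewards, key=rewards.get)

for name, r in sorted(rewards.items(), key=lambda kv: -kv[1]):
    print(f"{name}: reward={r:.3f}")
print("selected:", best)
```

In this toy setup the most accurate candidate is not selected, because its latency overshoots the budget. A search controller would use such a scalar reward to rank sampled architectures, which is the sense in which the search is "latency-aware".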
