CV, Apr 11, 2023

PP-MobileSeg: Explore the Fast and Accurate Semantic Segmentation Model on Mobile Devices

arXiv:2304.05152v1, 22 citations, h-index: 12, Has Code
Originality: Incremental advance
AI Analysis

This work addresses the need for fast and accurate semantic segmentation models on mobile devices, offering an incremental improvement over existing methods.

The authors adapt transformers to semantic segmentation on mobile devices by proposing PP-MobileSeg. On the ADE20K dataset, it reports state-of-the-art performance: 1.57% higher mIoU, 32.9% fewer parameters, and 42.3% acceleration compared to SeaFormer-Base.

The success of transformers in computer vision has led to several attempts to adapt them for mobile devices, but their performance remains unsatisfactory in some real-world applications. To address this issue, we propose PP-MobileSeg, a semantic segmentation model that achieves state-of-the-art performance on mobile devices. PP-MobileSeg comprises three novel parts: the StrideFormer backbone, the Aggregated Attention Module (AAM), and the Valid Interpolate Module (VIM). The four-stage StrideFormer backbone is built with MobileNetV3 (MV3) blocks and strided SEA attention, and it extracts rich semantic and detailed features with minimal parameter overhead. The AAM first filters the detailed features through semantic feature ensemble voting and then combines them with semantic features to enhance the semantic information. Finally, we propose VIM to upsample the downsampled feature to the resolution of the input image. Because this final upsampling is the most significant contributor to overall model latency, VIM significantly reduces latency by interpolating only the classes present in the final prediction. Extensive experiments show that PP-MobileSeg achieves a superior tradeoff between accuracy, model size, and latency compared to other methods. On the ADE20K dataset, PP-MobileSeg achieves 1.57% higher mIoU than SeaFormer-Base with 32.9% fewer parameters and 42.3% acceleration on Qualcomm Snapdragon 855. Source code is available at https://github.com/PaddlePaddle/PaddleSeg/tree/release/2.8.
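The VIM idea described in the abstract (interpolate only the classes that appear in the final prediction) can be illustrated with a minimal sketch. This is not the PaddleSeg implementation: the function name `valid_interpolate`, the use of a low-resolution argmax to decide which classes are "present", and nearest-neighbour upsampling in place of whatever interpolation the authors use are all assumptions made to keep the example small and dependency-free.

```python
def valid_interpolate(logits, scale):
    """Hypothetical sketch of class-filtered upsampling.

    logits: list of C channel maps, each an h x w nested list of scores.
    scale:  integer upsampling factor.
    Returns (labels, valid): an (h*scale) x (w*scale) label map and the
    sorted list of class ids that were actually upsampled.
    """
    num_classes = len(logits)
    h, w = len(logits[0]), len(logits[0][0])

    # 1. Argmax at low resolution: which class wins each coarse pixel.
    coarse = [
        [max(range(num_classes), key=lambda c: logits[c][y][x]) for x in range(w)]
        for y in range(h)
    ]

    # 2. Keep only classes that win at least one coarse pixel (assumption:
    #    this is how "classes present in the final prediction" are found).
    valid = sorted({c for row in coarse for c in row})

    # 3. Upsample only the kept channels (nearest neighbour, for simplicity).
    def upsample(channel):
        return [
            [channel[y // scale][x // scale] for x in range(w * scale)]
            for y in range(h * scale)
        ]

    up = [upsample(logits[c]) for c in valid]

    # 4. Argmax over the reduced channel set, mapped back to original ids.
    labels = [
        [valid[max(range(len(valid)), key=lambda i: up[i][y][x])]
         for x in range(w * scale)]
        for y in range(h * scale)
    ]
    return labels, valid


# Toy input: 4 classes on a 2 x 2 grid; only classes 1 and 3 ever win.
logits = [
    [[0.1, 0.2], [0.0, 0.1]],  # class 0
    [[0.9, 0.8], [0.2, 0.1]],  # class 1
    [[0.3, 0.1], [0.1, 0.3]],  # class 2
    [[0.2, 0.3], [0.7, 0.9]],  # class 3
]
labels, valid = valid_interpolate(logits, scale=2)
```

In this toy case only 2 of the 4 channels are upsampled. On a dataset such as ADE20K, where a single image typically contains far fewer classes than the full label set, the same filtering would shrink the tensor being interpolated by a much larger factor, which is consistent with the latency argument made in the abstract.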

Code Implementations: 1 repo