MOSAIC: Mobile Segmentation via decoding Aggregated Information and encoded Context
This work addresses the problem of accurate and efficient image segmentation for mobile applications, representing an incremental improvement over existing methods.
The paper tackles efficient semantic image segmentation on mobile devices by introducing MOSAIC, a neural network architecture that achieves a 5% absolute accuracy gain over the current industry-standard MLPerf models and state-of-the-art architectures while balancing accuracy and computational cost.
We present MOSAIC, a next-generation neural network architecture for efficient and accurate semantic image segmentation on mobile devices. MOSAIC is built from neural operations that are commonly supported across diverse mobile hardware platforms, which allows flexible deployment. It uses a simple asymmetric encoder-decoder structure, consisting of an efficient multi-scale context encoder and a light-weight hybrid decoder that recovers spatial details from aggregated information, and achieves new state-of-the-art performance while balancing accuracy and computational cost. Deployed on top of a tailored feature extraction backbone based on a searched classification network, MOSAIC achieves a 5% absolute accuracy gain over the current industry-standard MLPerf models and state-of-the-art architectures.