CV | AI | Oct 31, 2021

A Simple Approach to Image Tilt Correction with Self-Attention MobileNet for Smartphones

arXiv:2111.00398v1 | 3 citations
Originality: Incremental advance
AI Analysis

This work addresses image tilt detection for mobile devices, offering an incremental improvement over existing models.

The paper tackles image tilt correction for smartphones by proposing a Self-Attention MobileNet (SA-MobileNet) and a novel training pipeline. It reports state-of-the-art results, with accuracy 6.42 to 10.51 percentage points higher than MobileNetV3 on the SUN397, NYU-V1, and ADE20K datasets and inference at least 4 milliseconds faster on a Snapdragon 750 octa-core processor.

The main contributions of our work are two-fold. First, we present a Self-Attention MobileNet, called SA-MobileNet, that can model long-range dependencies between image features instead of processing only a local region, as standard convolutional kernels do. SA-MobileNet integrates self-attention modules into the inverted bottleneck blocks of the MobileNetV3 model, which lets it model both channel-wise and spatial attention over the image features while also introducing a novel self-attention architecture for low-resource devices. Second, we propose a novel training pipeline for the task of image tilt detection. We treat the problem as a multi-label task in which we predict multiple angles for a tilted input image within a narrow interval of 1-2 degrees, depending on the dataset used. This process induces an implicit correlation between labels without the computational overhead of second- or higher-order methods in multi-label learning. Combining this approach with the architecture, we present state-of-the-art results on detecting the image tilt angle on mobile devices as compared to the MobileNetV3 model. Finally, we establish that SA-MobileNet is more accurate than MobileNetV3 on the SUN397, NYU-V1, and ADE20K datasets by 6.42, 10.51, and 9.09 percentage points respectively, and faster by at least 4 milliseconds on a Snapdragon 750 octa-core processor.
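The multi-label formulation described above can be illustrated with a small sketch. It is not the authors' pipeline: the angle range (-20 to 20 degrees), the 0.5-degree bin step, the per-label sigmoid cross-entropy loss, and the averaging decoder are all assumptions chosen for illustration, and every function name is hypothetical. The only detail taken from the abstract is that several angle labels inside a narrow interval (1 degree here) around the true tilt are treated as positive at once.

```python
import numpy as np


def angle_bins(min_deg=-20.0, max_deg=20.0, step_deg=0.5):
    """Candidate tilt angles, one per output label (range and step are assumptions)."""
    count = int(round((max_deg - min_deg) / step_deg)) + 1
    return min_deg + step_deg * np.arange(count)


def multi_hot_target(true_deg, bins, interval_deg=1.0):
    """Mark every bin inside an interval of width interval_deg centred on the true angle."""
    half = interval_deg / 2.0
    # Small tolerance so bins lying exactly on the interval edge are included.
    return (np.abs(bins - true_deg) <= half + 1e-9).astype(np.float32)


def binary_cross_entropy(logits, target):
    """Mean per-label sigmoid cross-entropy; each label is scored independently."""
    # Numerically stable form: max(x, 0) - x * y + log(1 + exp(-|x|))
    loss = np.maximum(logits, 0.0) - logits * target + np.log1p(np.exp(-np.abs(logits)))
    return float(loss.mean())


def decode_angle(scores, bins, threshold=0.5):
    """Average the angles of all labels above threshold; fall back to the top label."""
    active = scores >= threshold
    if not active.any():
        return float(bins[int(np.argmax(scores))])
    return float(bins[active].mean())


bins = angle_bins()
target = multi_hot_target(3.0, bins)  # positives at 2.5, 3.0 and 3.5 degrees
```

Because neighbouring angle labels are always switched on together, a network trained on such targets is pushed to give adjacent angles similar scores. This is one plausible reading of the "implicit correlation between labels" that the abstract mentions, obtained with a loss that still treats each label independently.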

