CV · Nov 12, 2022

MSLKANet: A Multi-Scale Large Kernel Attention Network for Scene Text Removal

arXiv:2211.06565v1 · 2 citations · h-index: 2
Originality: Incremental advance
AI Analysis

This work addresses privacy protection and text editing in natural images, but it is incremental: it builds on prior deep learning methods and mainly adds a focus on large receptive fields.

The paper tackles scene text removal in full images by proposing MSLKANet, which uses multi-scale large kernel attention and large kernel spatial pyramid pooling to capture global information, achieving state-of-the-art performance on synthetic and real-world datasets.

Scene text removal aims to remove text from natural images and fill the affected regions with perceptually plausible background. It has attracted increasing attention because of its applications in privacy protection, scene text retrieval, and text editing. Deep learning has brought significant improvements, but most existing methods overlook large receptive fields and global information. Notably, the pioneering method improves significantly when its training data is simply changed from cropped images to full images. In this paper, we present MSLKANet, a single-stage multi-scale network for scene text removal in full images. To obtain large receptive fields and global information, we propose multi-scale large kernel attention (MSLKA), which captures long-range dependencies between text regions and backgrounds at various granularity levels. Furthermore, we combine the large kernel decomposition mechanism with atrous spatial pyramid pooling to build large kernel spatial pyramid pooling (LKSPP), which can perceive more valid pixels in the spatial dimension while maintaining large receptive fields and low computational cost. Extensive experimental results indicate that the proposed method achieves state-of-the-art performance on both synthetic and real-world datasets and confirm the effectiveness of the proposed MSLKA and LKSPP components.
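The abstract's claim that large kernel decomposition keeps a large receptive field at low cost can be checked with simple arithmetic. The sketch below is not the paper's implementation. It assumes the commonly used decomposition of a K x K kernel into a (2d-1) x (2d-1) depthwise convolution, a ceil(K/d) x ceil(K/d) depthwise convolution with dilation d, and a 1 x 1 convolution. The helper names and the (K, d) pairs are hypothetical and chosen only for illustration.

```python
import math


def receptive_field(layers):
    """One-sided receptive field of stacked stride-1 convolutions.

    `layers` is a list of (kernel_size, dilation) pairs. Each layer widens
    the receptive field by (kernel_size - 1) * dilation pixels.
    """
    rf = 1
    for kernel_size, dilation in layers:
        rf += (kernel_size - 1) * dilation
    return rf


def decompose_large_kernel(K, d):
    """Hypothetical helper: split a K x K kernel into two depthwise steps.

    Assumed scheme (not confirmed by the abstract): a local depthwise conv
    of size 2d-1, then a dilated depthwise conv of size ceil(K/d) with
    dilation d. The trailing 1 x 1 conv mixes channels and does not change
    the receptive field, so it is omitted here.
    """
    local = (2 * d - 1, 1)
    dilated = (math.ceil(K / d), d)
    return [local, dilated]


def depthwise_weights_per_channel(layers):
    """Spatial weights per channel for a stack of depthwise convolutions."""
    return sum(k * k for k, _ in layers)


if __name__ == "__main__":
    # Illustrative scales only; the abstract does not state which are used.
    for K, d in [(7, 2), (21, 3), (35, 4)]:
        layers = decompose_large_kernel(K, d)
        print(
            f"K={K:2d} d={d}: receptive field {receptive_field(layers)}, "
            f"{depthwise_weights_per_channel(layers)} weights per channel "
            f"vs {K * K} for a dense K x K depthwise kernel"
        )
```

Under these assumptions, the decomposed stack covers at least the K x K window while using far fewer spatial weights per channel, which is the trade-off the abstract attributes to MSLKA and LKSPP.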

