CV · Nov 19, 2020

Scene text removal via cascaded text stroke detection and erasing

arXiv:2011.09768v1 · 36 citations
AI Analysis

This work provides a more effective solution for scene text removal, which is beneficial for applications requiring clean images, such as image editing and augmented reality.

This paper addresses the problem of removing scene text by proposing a cascaded framework that first detects text strokes and then erases them. The method significantly outperforms state-of-the-art approaches in locating and erasing scene text, and also introduces a new real-world dataset for evaluation.

Recent learning-based approaches show promising performance improvements on the scene text removal task. However, these methods usually leave text remnants and produce visually unpleasant results. In this work, we propose a novel "end-to-end" framework based on accurate text stroke detection. Specifically, we decouple the text removal problem into text stroke detection and stroke removal. We design a text stroke detection network and a text removal generation network to solve these two sub-problems separately. We then combine these two networks into a processing unit and cascade this unit to obtain the final text removal model. Experimental results demonstrate that the proposed method significantly outperforms state-of-the-art approaches for locating and erasing scene text. Because current publicly available datasets are all synthetic and cannot properly measure the performance of different methods, we construct a new real-world dataset, which will be released to facilitate related research.
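
The control flow described in the abstract (detect strokes, erase them, then repeat the pair as a cascaded unit) can be illustrated with a minimal sketch. Everything here is an assumption for illustration: `detect_strokes` and `erase_strokes` are toy stand-ins (a darkness threshold and a mean fill) and not the paper's networks, and the per-stage thresholds are hypothetical. Only the structure of the cascade reflects the text.

```python
from typing import List, Sequence, Tuple

Image = List[List[float]]  # grayscale image, values in [0, 1]
Mask = List[List[int]]     # 1 = stroke pixel, 0 = background


def detect_strokes(image: Image, threshold: float) -> Mask:
    """Toy stand-in for a stroke detection network: dark pixels count as strokes."""
    return [[1 if px < threshold else 0 for px in row] for row in image]


def erase_strokes(image: Image, mask: Mask) -> Image:
    """Toy stand-in for a removal network: fill stroke pixels with the mean background value."""
    background = [
        px
        for row, mask_row in zip(image, mask)
        for px, m in zip(row, mask_row)
        if m == 0
    ]
    fill = sum(background) / len(background) if background else 0.0
    return [
        [fill if m else px for px, m in zip(row, mask_row)]
        for row, mask_row in zip(image, mask)
    ]


def processing_unit(image: Image, threshold: float) -> Tuple[Image, Mask]:
    """One unit: stroke detection followed by stroke removal."""
    mask = detect_strokes(image, threshold)
    return erase_strokes(image, mask), mask


def cascade(image: Image, thresholds: Sequence[float]) -> Tuple[Image, List[Mask]]:
    """Apply the processing unit repeatedly; each stage sees the previous stage's output."""
    masks: List[Mask] = []
    for threshold in thresholds:
        image, mask = processing_unit(image, threshold)
        masks.append(mask)
    return image, masks


# Hypothetical 2x3 image: bright background, one dark stroke, one faint remnant.
sample = [[0.9, 0.1, 0.9],
          [0.9, 0.6, 0.9]]
cleaned, stage_masks = cascade(sample, thresholds=(0.5, 0.8))
```

In this toy run the first stage removes the dark stroke and the second stage catches the fainter remnant the first one missed, which is the motivation the abstract gives for cascading the unit.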

Code Implementations: 1 repo