SwapText: Image Based Texts Transfer in Scenes
This work addresses the challenge of text transfer in images for applications such as translation and synthesis, but it appears incremental because it builds on existing scene text manipulation methods.
The authors tackle the problem of swapping text in scene images while preserving original visual attributes such as fonts and backgrounds. They present SwapText, a three-stage framework that can manipulate text even under severe geometric distortion.
Swapping text in scene images while preserving the original fonts, colors, sizes and background textures is a challenging task due to the complex interplay between these factors. In this work, we present SwapText, a three-stage framework to transfer text across scene images. First, a novel text swapping network is proposed to replace text labels only in the foreground image. Second, a background completion network is learned to reconstruct background images. Finally, the generated foreground image and background image are combined by the fusion network to generate the word image. Using the proposed framework, we can manipulate the text of the input images even under severe geometric distortion. Qualitative and quantitative results are presented on several scene text datasets, including regular and irregular text datasets. We conducted extensive experiments to demonstrate the usefulness of our method in applications such as image-based text translation and text image synthesis.
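The data flow of the three stages can be sketched as below. This is only an illustration of how the stages connect, not the authors' implementation: each function is a hand-written stand-in for a learned network, the names (`swap_text_stub`, `complete_background_stub`, `fuse_stub`, `swap_text_pipeline`) are hypothetical, images are simplified to grayscale nested lists, and the explicit `style_mask` argument is an assumption added so that the background stand-in knows which pixels to fill.

```python
from typing import List

Image = List[List[float]]  # grayscale image, pixel values in [0, 1]


def swap_text_stub(content: Image, style: Image) -> Image:
    """Stand-in for the text swapping network (stage 1).

    The real network renders the new text in the style of `style`;
    this stub simply returns a copy of the content glyph mask.
    """
    return [row[:] for row in content]


def complete_background_stub(style: Image, style_mask: Image) -> Image:
    """Stand-in for the background completion network (stage 2).

    Fills pixels marked as text in `style_mask` with the mean of the
    remaining (non-text) pixels. `style_mask` is an assumed input.
    """
    bg_pixels = [
        p
        for row, mask_row in zip(style, style_mask)
        for p, m in zip(row, mask_row)
        if not m
    ]
    fill = sum(bg_pixels) / len(bg_pixels) if bg_pixels else 0.0
    return [
        [fill if m else p for p, m in zip(row, mask_row)]
        for row, mask_row in zip(style, style_mask)
    ]


def fuse_stub(foreground: Image, background: Image, text_value: float = 0.0) -> Image:
    """Stand-in for the fusion network (stage 3).

    Paints `text_value` wherever the foreground mask is set and keeps
    the completed background elsewhere.
    """
    return [
        [text_value if f else b for f, b in zip(fg_row, bg_row)]
        for fg_row, bg_row in zip(foreground, background)
    ]


def swap_text_pipeline(content: Image, style: Image, style_mask: Image) -> Image:
    """Chain the three stages: foreground, background, then fusion."""
    foreground = swap_text_stub(content, style)
    background = complete_background_stub(style, style_mask)
    return fuse_stub(foreground, background)


# Toy 1x3 example: the old text sits in the middle pixel, the new text
# should appear in the left pixel.
style = [[1.0, 0.0, 1.0]]
style_mask = [[0, 1, 0]]
content = [[1, 0, 0]]
result = swap_text_pipeline(content, style, style_mask)
```

In this toy example the old text pixel is filled from the surrounding background and the new text is painted in its new position. Producing the foreground and background separately before fusing them mirrors the decomposition described in the abstract.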