Realistic text replacement with non-uniform style conditioning
This work addresses seamless text replacement in images, with applications such as image editing and forgery detection, and presents an incremental improvement over prior methods.
The paper tackles the problem of realistic text replacement in images by developing a novel non-uniform style conditioning layer applied to an encoder-decoder ResNet architecture, achieving results that outperform existing approaches on the ICDAR MLT benchmark.
In this work, we study the possibility of realistic text replacement, the goal of which is to replace text present in an image with user-supplied text. The replacement should be performed so that the resulting image cannot be distinguished from the original one. We achieve this goal by developing a novel non-uniform style conditioning layer and applying it to an encoder-decoder ResNet-based architecture. The resulting model is single-stage and requires no post-processing. The proposed model achieves realistic text replacement and outperforms existing approaches on ICDAR MLT.
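The abstract does not specify the form of the non-uniform style conditioning layer, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes that "non-uniform" means the style modulation varies per spatial location, in contrast to uniform conditioning that applies one scale and shift per channel. The function names, the `gamma_map` and `beta_map` parameters, and all values are hypothetical.

```python
# Illustrative sketch only: NOT the paper's layer, whose exact form is not
# given in the abstract. Assumption: "non-uniform" means the modulation
# varies per spatial location rather than being one value per channel.
# Feature maps are nested lists indexed as [channel][row][column].


def uniform_condition(features, gamma, beta):
    """Uniform conditioning: one (gamma, beta) pair per channel."""
    return [
        [[gamma[c] * v + beta[c] for v in row] for row in channel]
        for c, channel in enumerate(features)
    ]


def non_uniform_condition(features, gamma_map, beta_map):
    """Spatially varying conditioning: one (gamma, beta) pair per location."""
    return [
        [
            [g * v + b for v, g, b in zip(row, g_row, b_row)]
            for row, g_row, b_row in zip(channel, g_ch, b_ch)
        ]
        for channel, g_ch, b_ch in zip(features, gamma_map, beta_map)
    ]


# One channel, 2x2 feature map (hypothetical values).
features = [[[1.0, 2.0], [3.0, 4.0]]]

# Modulate only the right column, e.g. where a text region is assumed to lie;
# the left column passes through unchanged (gamma = 1, beta = 0).
gamma_map = [[[1.0, 2.0], [1.0, 2.0]]]
beta_map = [[[0.0, 0.5], [0.0, 0.5]]]

styled = non_uniform_condition(features, gamma_map, beta_map)
print(styled)
```

Under this reading, the per-location maps let the style act differently on text and background regions of the same feature map, which a single per-channel scale and shift cannot do.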