CV · Jan 21, 2019

Generating Text Sequence Images for Recognition

arXiv:1901.06782v1 · 1 citation
Originality: Incremental advance
AI Analysis

This work addresses a data scarcity problem for researchers and practitioners in text recognition. It is incremental, since it builds on existing synthesis methods and mainly simplifies the process.

The paper tackles the shortage of labeled text sequence images for deep learning-based text recognition. It proposes a method that generates unlimited training data without auxiliary pre- or post-processing, treating generation as image-to-image translation with conditional adversarial networks. The reported evaluation indicates that the quality of the generated data is satisfactory.

Recently, methods based on deep learning have dominated the field of text recognition. Given a large amount of training data, most of them can achieve state-of-the-art performance. However, it is hard to harvest and label sufficient text sequence images from real scenes. To mitigate this issue, several methods for synthesizing text sequence images have been proposed, yet they usually need complicated preceding or follow-up steps. In this work, we present a method that can generate an unlimited amount of training data without any auxiliary pre- or post-processing. We treat the generation task as an image-to-image translation problem and use conditional adversarial networks to produce realistic text sequence images conditioned on semantic ones. Several evaluation metrics are used to assess our method, and the results demonstrate that the quality of the data is satisfactory. The code and dataset will be publicly available soon.
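
The abstract does not spell out the training objective, so the following is only a minimal sketch of the kind of loss commonly used for conditional adversarial image-to-image translation (a pix2pix-style adversarial term plus an L1 term). The function names, the flat-list image representation, and the `l1_weight=100.0` default are assumptions for illustration, not details taken from the paper.

```python
import math


def bce(prob, target):
    """Binary cross-entropy for a single probability, clamped away from 0 and 1."""
    eps = 1e-12
    prob = min(max(prob, eps), 1.0 - eps)
    return -(target * math.log(prob) + (1.0 - target) * math.log(1.0 - prob))


def discriminator_loss(d_real, d_fake):
    """Discriminator objective (assumed form).

    d_real: D's probabilities for (semantic image, real image) pairs, labeled 1.
    d_fake: D's probabilities for (semantic image, generated image) pairs, labeled 0.
    """
    real_term = sum(bce(p, 1.0) for p in d_real) / len(d_real)
    fake_term = sum(bce(p, 0.0) for p in d_fake) / len(d_fake)
    return 0.5 * (real_term + fake_term)


def generator_loss(d_fake, generated, target, l1_weight=100.0):
    """Generator objective (assumed form): fool D and stay close to the real image.

    generated, target: pixel values as flat lists of floats of equal length.
    l1_weight: hypothetical weighting of the L1 term.
    """
    adv = sum(bce(p, 1.0) for p in d_fake) / len(d_fake)
    l1 = sum(abs(g - t) for g, t in zip(generated, target)) / len(target)
    return adv + l1_weight * l1
```

In this sketch the semantic image enters only through the discriminator's probabilities, which are assumed to be computed on (semantic, output) pairs; a real implementation would evaluate both networks on image tensors in a deep learning framework.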

Code Implementations: 1 repo
