Decomposition-Based Domain Adaptation for Real-World Font Recognition
This addresses the problem of poor generalization to real-world data in font recognition for applications like document analysis, though it is incremental as it builds on existing domain adaptation techniques.
The paper tackles the domain mismatch between synthetic training and real-world testing data in font recognition by introducing a decomposition-based domain adaptation framework, achieving over 80% top-5 accuracy on a new real-world dataset.
We present a domain adaption framework to address a domain mismatch between synthetic training and real-world testing data. We demonstrate our method on a challenging fine-grain classification problem: recognizing a font style from an image of text. In this task, it is very easy to generate lots of rendered font examples but very hard to obtain real-world labeled images. This real-to-synthetic domain gap caused poor generalization to new real data in previous font recognition methods (Chen et al. (2014)). In this paper, we introduce a Convolutional Neural Network decomposition approach, leveraging a large training corpus of synthetic data to obtain effective features for classification. This is done using an adaptation technique based on a Stacked Convolutional Auto-Encoder that exploits a large collection of unlabeled real-world text images combined with synthetic data preprocessed in a specific way. The proposed DeepFont method achieves an accuracy of higher than 80% (top-5) on a new large labeled real-world dataset we collected.