Stratified Domain Adaptation: A Progressive Self-Training Approach for Scene Text Recognition
This work addresses domain adaptation in scene text recognition and offers an incremental advance in model robustness across different visual domains.
The paper tackles the performance degradation that unsupervised domain adaptation for scene text recognition suffers when the domain gap is large. It introduces a Stratified Domain Adaptation approach that partitions the training data into subsets for progressive self-training, and experiments on benchmark datasets show significant improvements over baseline models.
Unsupervised domain adaptation (UDA) has become increasingly prevalent in scene text recognition (STR), especially where training and testing data reside in different domains. The efficacy of existing UDA approaches tends to degrade when there is a large gap between the source and target domains. The key to dealing with this problem is to shift gradually, or to learn progressively, from one domain to the next. In this paper, we introduce the Stratified Domain Adaptation (StrDA) approach, which accounts for the gradual escalation of the domain gap during the learning process. The objective is to partition the training data into subsets so that the progressively self-trained model can adapt to gradual changes. We stratify the training data by evaluating the proximity of each data sample to both the source and target domains. We propose a novel method that employs domain discriminators to estimate the out-of-distribution and domain discriminative levels of data samples. Extensive experiments on benchmark scene-text datasets show that our approach significantly improves the performance of baseline (source-trained) STR models.
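The stratify-then-self-train idea described above can be illustrated with a minimal Python sketch. It is not the paper's implementation. It assumes each unlabeled sample already has a single scalar score, here called `source_proximity` (a hypothetical name), where higher means closer to the source domain. How the domain discriminators produce such a score is not shown. The equal-size split and the `pseudo_label` and `fine_tune` callables are likewise illustrative assumptions.

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")
M = TypeVar("M")


def stratify(samples: Sequence[T],
             source_proximity: Sequence[float],
             n_subsets: int) -> List[List[T]]:
    """Split samples into n_subsets groups, ordered from closest to the
    source domain to farthest from it.

    Assumption: source_proximity[i] is a scalar score for samples[i],
    with higher values meaning "more source-like". Near-equal subset
    sizes are an illustrative choice, not taken from the paper.
    """
    if len(samples) != len(source_proximity):
        raise ValueError("samples and source_proximity must have equal length")
    if n_subsets < 1:
        raise ValueError("n_subsets must be at least 1")

    # Indices sorted by descending proximity to the source domain.
    order = sorted(range(len(samples)),
                   key=lambda i: source_proximity[i],
                   reverse=True)

    # Distribute the remainder over the first subsets so sizes differ by at most 1.
    base, extra = divmod(len(order), n_subsets)
    subsets: List[List[T]] = []
    start = 0
    for k in range(n_subsets):
        size = base + (1 if k < extra else 0)
        subsets.append([samples[i] for i in order[start:start + size]])
        start += size
    return subsets


def progressive_self_training(model: M,
                              subsets: Sequence[Sequence[T]],
                              pseudo_label: Callable[[M, Sequence[T]], list],
                              fine_tune: Callable[[M, list], M]) -> M:
    """Adapt the model one subset at a time, nearest-to-source first.

    pseudo_label and fine_tune are hypothetical callables supplied by the
    caller: the first labels a subset with the current model, the second
    returns a model updated on those pseudo-labeled samples.
    """
    for subset in subsets:
        labeled = pseudo_label(model, subset)
        model = fine_tune(model, labeled)
    return model
```

Ordering the subsets from most to least source-like means each self-training round faces only a small additional shift, which is the intuition behind the gradual adaptation the abstract describes.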