CV · Jul 23, 2019

GA-DAN: Geometry-Aware Domain Adaptation Network for Scene Text Detection and Recognition

arXiv:1907.09653v1 · 89 citations
Originality: Incremental advance
AI Analysis

This work addresses domain adaptation for scene text detection and recognition. The method is new and improves both tasks, but the advance is incremental because it builds on existing adversarial learning approaches.

The paper tackles cross-domain shifts in both geometry and appearance spaces for scene text detection and recognition. It introduces GA-DAN, which combines a multi-modal spatial learning technique with a disentangled cycle-consistency loss. In the experiments, networks trained on the domain-adapted images achieve superior detection and recognition performance.

Recent adversarial learning research has achieved very impressive progress in modelling cross-domain data shifts in appearance space, but its counterpart in modelling cross-domain shifts in geometry space lags far behind. This paper presents an innovative Geometry-Aware Domain Adaptation Network (GA-DAN) that is capable of modelling cross-domain shifts concurrently in both geometry space and appearance space and realistically converting images across domains with very different characteristics. In the proposed GA-DAN, a novel multi-modal spatial learning technique is designed which converts a source-domain image into multiple images of different spatial views as in the target domain. A new disentangled cycle-consistency loss is introduced which balances the cycle consistency in appearance and geometry spaces and greatly improves the learning of the whole network. The proposed GA-DAN has been evaluated on the classic scene text detection and recognition tasks, and experiments show that the domain-adapted images achieve superior scene text detection and recognition performance when applied to network training.

