CV | Feb 21, 2023

A3S: Adversarial learning of semantic representations for Scene-Text Spotting

arXiv:2302.10641v1 | 9 citations | h-index: 7
Originality: Incremental advance
AI Analysis

This work addresses the challenge of accurate text recognition in natural scene images for applications like document analysis, though it appears incremental by building on existing detection methods.

The paper tackles the problem of insufficient end-to-end accuracy in scene-text spotting by proposing A3S, which uses adversarial learning of semantic representations to improve text recognition, achieving better accuracy than other methods on public datasets.

Scene-text spotting is the task of simultaneously detecting text regions in natural scene images and recognizing the characters they contain. It has attracted much attention in recent years because of its wide range of applications. Existing research has focused mainly on improving text region detection rather than text recognition, so while detection accuracy has improved, end-to-end accuracy remains insufficient. Text in natural scene images tends to be not a random string of characters but a meaningful one: a word. We therefore propose adversarial learning of semantic representations for scene-text spotting (A3S) to improve end-to-end accuracy, including text recognition. Rather than recognizing text from existing visual features alone, A3S also predicts semantic features for the detected text area. Experimental results on publicly available datasets show that the proposed method achieves better accuracy than other methods.

