CV, Apr 5, 2022

Text Spotting Transformers

arXiv:2204.01918v1, 128 citations, h-index: 75
Originality: Highly original
AI Analysis

This paper addresses the problem of accurately spotting curved and arbitrarily shaped text in real-world images for computer vision applications. It proposes a new method rather than an incremental improvement.

The paper tackles text detection and recognition in the wild by proposing TESTR, an end-to-end Transformer-based framework that jointly performs text-box control point regression and character recognition without Region-of-Interest operations or heuristic post-processing. Experiments show state-of-the-art performance on curved and arbitrarily shaped text datasets.

In this paper, we present TExt Spotting TRansformers (TESTR), a generic end-to-end text spotting framework using Transformers for text detection and recognition in the wild. TESTR builds upon a single encoder and dual decoders for joint text-box control point regression and character recognition. Unlike most existing literature, our method is free from Region-of-Interest operations and heuristics-driven post-processing procedures. TESTR is particularly effective when dealing with curved text-boxes, where special care is needed to adapt traditional bounding-box representations. We show that our canonical representation of control points is suitable for text instances in both Bezier curve and polygon annotations. In addition, we design a bounding-box guided polygon detection (box-to-polygon) process. Experiments on curved and arbitrarily shaped datasets demonstrate state-of-the-art performance of the proposed TESTR algorithm.
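The abstract says control points can describe text instances in both Bezier curve and polygon annotations. As a rough illustration of how the two forms relate, the sketch below samples polygon vertices from Bezier control points. It is not taken from the TESTR code. It assumes each text instance is bounded by two cubic Bezier curves (four control points for the top edge and four for the bottom edge, with the bottom edge listed right to left); the abstract does not state this layout. The names `cubic_bezier` and `bezier_to_polygon` are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def cubic_bezier(p0: Point, p1: Point, p2: Point, p3: Point, t: float) -> Point:
    """Evaluate a cubic Bezier curve at parameter t in [0, 1] (Bernstein form)."""
    u = 1.0 - t
    b0, b1, b2, b3 = u ** 3, 3 * u * u * t, 3 * u * t * t, t ** 3
    x = b0 * p0[0] + b1 * p1[0] + b2 * p2[0] + b3 * p3[0]
    y = b0 * p0[1] + b1 * p1[1] + b2 * p2[1] + b3 * p3[1]
    return (x, y)


def bezier_to_polygon(
    top: Sequence[Point], bottom: Sequence[Point], n: int = 8
) -> List[Point]:
    """Sample n points on each edge and return a 2n-vertex polygon.

    Assumption (not from the paper): `top` runs left to right and `bottom`
    runs right to left, so concatenating the samples walks the outline once.
    """
    if len(top) != 4 or len(bottom) != 4:
        raise ValueError("each edge needs exactly four control points")
    if n < 2:
        raise ValueError("need at least two samples per edge")
    ts = [i / (n - 1) for i in range(n)]
    upper = [cubic_bezier(top[0], top[1], top[2], top[3], t) for t in ts]
    lower = [cubic_bezier(bottom[0], bottom[1], bottom[2], bottom[3], t) for t in ts]
    return upper + lower


if __name__ == "__main__":
    # A gently arched word: top edge bulges upward, bottom edge follows it.
    top_edge = [(0.0, 10.0), (10.0, 0.0), (20.0, 0.0), (30.0, 10.0)]
    bottom_edge = [(30.0, 20.0), (20.0, 10.0), (10.0, 10.0), (0.0, 20.0)]
    for vertex in bezier_to_polygon(top_edge, bottom_edge, n=4):
        print(vertex)
```

Sampling more points per edge gives a closer polygon approximation of the curve, at the cost of more coordinates to regress per instance.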

Code Implementations: 1 repo
