CV · Jan 10, 2019

A Multi-Object Rectified Attention Network for Scene Text Recognition

arXiv:1901.03003v1 · 302 citations · Has Code
Originality: Incremental advance
AI Analysis

This paper addresses scene text recognition, which matters for applications such as document analysis and image understanding. The advance is incremental because it builds on existing attention-based methods.

The paper tackles the problem of recognizing irregular scene text with various shapes and distorted patterns by proposing a multi-object rectified attention network (MORAN), which achieves state-of-the-art performance on various benchmarks.

Irregular text is widely used. However, it is considerably difficult to recognize because of its various shapes and distorted patterns. In this paper, we thus propose a multi-object rectified attention network (MORAN) for general scene text recognition. The MORAN consists of a multi-object rectification network and an attention-based sequence recognition network. The multi-object rectification network is designed for rectifying images that contain irregular text. It decreases the difficulty of recognition and enables the attention-based sequence recognition network to more easily read irregular text. It is trained in a weak supervision way, thus requiring only images and corresponding text labels. The attention-based sequence recognition network focuses on target characters and sequentially outputs the predictions. Moreover, to improve the sensitivity of the attention-based sequence recognition network, a fractional pickup method is proposed for an attention-based decoder in the training phase. With the rectification mechanism, the MORAN can read both regular and irregular scene text. Extensive experiments on various benchmarks are conducted, which show that the MORAN achieves state-of-the-art performance. The source code is available.
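The abstract names a fractional pickup method for the attention-based decoder but does not spell it out. As a rough illustration only, the sketch below assumes it blends one randomly chosen pair of neighbouring attention weights with a random coefficient `beta` during training. The function name `fractional_pickup`, the blending formula, and the example weights are assumptions made for this illustration and are not taken from the text above or from the released source code.

```python
import random


def fractional_pickup(weights, rng=random):
    """Blend one randomly chosen pair of adjacent attention weights.

    Assumed form (not given in the abstract):
        a[k]   <- beta * a[k] + (1 - beta) * a[k + 1]
        a[k+1] <- (1 - beta) * a[k] + beta * a[k + 1]
    with k and beta drawn at random at each decoding step.
    """
    mixed = list(weights)
    if len(mixed) < 2:
        return mixed  # nothing to blend

    k = rng.randrange(len(mixed) - 1)  # index of the left neighbour
    beta = rng.random()                # blend coefficient in [0, 1)

    left, right = weights[k], weights[k + 1]
    mixed[k] = beta * left + (1 - beta) * right
    mixed[k + 1] = (1 - beta) * left + beta * right
    return mixed


if __name__ == "__main__":
    # Hypothetical attention weights over four feature positions.
    alpha = [0.05, 0.70, 0.20, 0.05]
    print(fractional_pickup(alpha, random.Random(0)))
```

Under this assumed form the blended pair keeps its combined mass, so the weights still sum to the same total. Only the split between two neighbouring positions is perturbed, which is one plausible way to expose the decoder to context on either side of the attended position.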

Code Implementations: 7 repos

Data from Papers with Code (CC-BY-SA-4.0)

