CV · Nov 5, 2019

Improving Long Handwritten Text Line Recognition with Convolutional Multi-way Associative Memory

arXiv:1911.01577v2 · 7 citations
AI Analysis

This work addresses a bottleneck in OCR for scanned documents, offering an incremental improvement over existing methods.

The paper tackled the problem of vanishing/exploding gradients in Convolutional Recurrent Neural Networks (CRNNs) when processing long handwritten text lines, which hinders Optical Character Recognition (OCR). It introduced a Convolutional Multi-way Associative Memory (CMAM) architecture that demonstrated superior performance against other CRNNs on three real-world long text OCR datasets.

Convolutional Recurrent Neural Networks (CRNNs) excel at scene text recognition. Unfortunately, they are likely to suffer from vanishing/exploding gradient problems when processing long text images, which are commonly found in scanned documents. This poses a major challenge to the goal of completely solving the Optical Character Recognition (OCR) problem. Inspired by recently proposed memory-augmented neural networks (MANNs) for long-term sequential modeling, we present a new architecture dubbed Convolutional Multi-way Associative Memory (CMAM) to tackle the limitation of current CRNNs. By leveraging recent memory accessing mechanisms in MANNs, our architecture demonstrates superior performance against other CRNN counterparts on three real-world long text OCR datasets.
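The abstract does not spell out how CMAM accesses its memory, so the following is only a minimal sketch of one generic mechanism commonly used in MANNs: a content-based read, where a query key is compared with every memory slot and the result is a similarity-weighted average of the slots. The function names, the `sharpness` parameter, and the toy memory are all illustrative assumptions, not taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def content_read(memory, key, sharpness=5.0):
    """Generic content-based read (hypothetical helper, not the paper's CMAM).

    memory: list of slots, each a list of floats of the same width.
    key: query vector of that width.
    sharpness: assumed scaling factor; larger values focus the read on the best match.
    Returns (read_vector, weights).
    """
    weights = softmax([sharpness * cosine(slot, key) for slot in memory])
    width = len(memory[0])
    read = [sum(w * slot[i] for w, slot in zip(weights, memory)) for i in range(width)]
    return read, weights


# Toy memory with three slots; the key is closest to the first slot.
memory = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
read_vector, weights = content_read(memory, key=[0.9, 0.1, 0.0])
```

Because a read of this kind reaches any slot in a single step, information stored early in a long sequence does not have to pass through every intermediate recurrent step, which is the general motivation for using external memory on long inputs.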

