CV · Apr 12, 2016

Scan, Attend and Read: End-to-End Handwritten Paragraph Recognition with MDLSTM Attention

Théodore Bluche, Jérôme Louradour, Ronaldo Messina

arXiv:1604.03286v3 19.2189 citations

Originality: Highly original

AI Analysis

This work addresses the challenge of automating document digitization for archives and libraries by eliminating the need for manual line segmentation, though it is an incremental advance that builds on attention models from other domains.

The authors tackle the problem of transcribing handwritten paragraphs without prior segmentation into lines. They report what is, to their knowledge, the first successful end-to-end multi-line handwriting recognition system, with encouraging results on the IAM Database.

We present an attention-based model for end-to-end handwriting recognition. Our system does not require any segmentation of the input paragraph. The model is inspired by the differentiable attention models presented recently for speech recognition, image captioning or translation. The main difference is the covert and overt attention, implemented as a multi-dimensional LSTM network. Our principal contribution towards handwriting recognition lies in the automatic transcription without a prior segmentation into lines, which was crucial in previous approaches. To the best of our knowledge this is the first successful attempt of end-to-end multi-line handwriting recognition. We carried out experiments on the well-known IAM Database. The results are encouraging and bring hope to perform full paragraph transcription in the near future.
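As a rough illustration of the differentiable ("soft") attention the abstract refers to, the sketch below normalizes a grid of scores over all positions of a 2D feature map and returns the weighted sum of the feature vectors. It is not the authors' implementation: in the paper the attention is produced by an MDLSTM network, whereas here the scores are simply passed in, and the names `soft_attention_2d`, `features`, and `scores` are hypothetical.

```python
import math


def soft_attention_2d(features, scores):
    """Soft attention over a 2D feature map (illustrative sketch only).

    features: H x W grid (nested lists) of D-dimensional feature vectors.
    scores:   H x W grid of unnormalized attention scores. In the paper these
              would come from an MDLSTM network; here they are given
              directly (an assumption made for this sketch).
    Returns (weights, context): the H x W attention map, which sums to 1,
    and the D-dimensional weighted sum of the feature vectors.
    """
    # Softmax over every position of the grid, shifted by the max for stability.
    max_score = max(s for row in scores for s in row)
    exps = [[math.exp(s - max_score) for s in row] for row in scores]
    total = sum(e for row in exps for e in row)
    weights = [[e / total for e in row] for row in exps]

    # Context vector: attention-weighted sum of all feature vectors.
    dim = len(features[0][0])
    context = [0.0] * dim
    for i, row in enumerate(features):
        for j, vec in enumerate(row):
            for d in range(dim):
                context[d] += weights[i][j] * vec[d]
    return weights, context


if __name__ == "__main__":
    # Toy 2 x 2 feature map with 2-dimensional features.
    features = [[[1.0, 0.0], [0.0, 1.0]],
                [[1.0, 1.0], [0.0, 0.0]]]
    # A high score at position (1, 0) focuses attention there.
    scores = [[0.0, 0.0], [5.0, 0.0]]
    weights, context = soft_attention_2d(features, scores)
    print(weights)
    print(context)
```

Because the weights are a smooth function of the scores, the whole operation is differentiable, which is what allows a model of this kind to learn where to look without being given line segmentations.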
