Fully Convolutional Networks for Handwriting Recognition
This work addresses the problem of recognizing handwritten text across Latin-based languages for applications in document digitization, offering an incremental improvement by simplifying the recognition pipeline while maintaining competitive performance.
The paper tackles handwritten text recognition by proposing a fully convolutional model that maps variable-length inputs to a stream of symbols, reducing the need for preprocessing such as symbol alignment correction and for post-processing such as connectionist temporal classification (CTC), dictionary matching, or language models. It achieves results competitive with state-of-the-art dictionary-based methods on the IAM and RIMES datasets, and introduces an attention-based mechanism that targets handwriting variations such as slant, stroke width, and noise.
Handwritten text recognition is challenging because of the virtually infinite ways a human can write the same message. Our fully convolutional handwriting model takes in a handwriting sample of unknown length and outputs an arbitrary stream of symbols. Our dual-stream architecture uses both local and global context and mitigates the need for heavy preprocessing steps such as symbol alignment correction, as well as complex post-processing steps such as connectionist temporal classification, dictionary matching, or language models. Using over 100 unique symbols, our model is agnostic to Latin-based languages and is shown to be competitive with state-of-the-art dictionary-based methods on the popular IAM and RIMES datasets. When a dictionary is known, we further allow a probabilistic character error rate to correct errant word blocks. Finally, we introduce an attention-based mechanism that can automatically target variants of handwriting, such as slant, stroke width, or noise.
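To make the dictionary correction step concrete, the following is a minimal sketch of one way a probabilistic character error rate could be used to correct an errant word block. It is an illustration under stated assumptions, not the formulation used in this work: the function names `probabilistic_cer` and `correct_block`, the representation of the model output as one symbol-to-probability mapping per position, and the cost scheme (a substitution costs one minus the predicted probability of the dictionary character; insertions and deletions cost one) are all hypothetical.

```python
from typing import Dict, Sequence


def probabilistic_cer(probs: Sequence[Dict[str, float]], word: str) -> float:
    """Soft edit distance between a predicted block and a dictionary word.

    Assumption: probs[i] maps each symbol to the model's probability at
    output position i. Matching position i to character c costs
    1 - probs[i].get(c, 0); insertions and deletions cost 1.
    The total is normalized by the length of the dictionary word.
    """
    n, m = len(probs), len(word)
    # dp[i][j] is the cheapest alignment of the first i predictions
    # with the first j characters of the word.
    dp = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = float(i)
    for j in range(1, m + 1):
        dp[0][j] = float(j)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = 1.0 - probs[i - 1].get(word[j - 1], 0.0)
            dp[i][j] = min(
                dp[i - 1][j - 1] + sub,  # soft match or substitution
                dp[i - 1][j] + 1.0,      # drop a predicted symbol
                dp[i][j - 1] + 1.0,      # insert a missing character
            )
    return dp[n][m] / max(m, 1)


def correct_block(probs: Sequence[Dict[str, float]], dictionary: Sequence[str]) -> str:
    """Return the dictionary word with the lowest probabilistic CER."""
    return min(dictionary, key=lambda w: probabilistic_cer(probs, w))


# Hypothetical model output: greedy decoding reads "hcuse" because the
# second symbol is slightly more likely to be "c" than "o".
block = [
    {"h": 1.0},
    {"c": 0.55, "o": 0.45},
    {"u": 1.0},
    {"s": 1.0},
    {"e": 1.0},
]
corrected = correct_block(block, ["house", "horse", "mouse"])
```

In this toy case the greedy reading "hcuse" is not a word, and `corrected` is "house" because the uncertain second position makes that substitution cheap, while "horse" and "mouse" each require an additional full-cost substitution.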