CV, LG · Nov 13, 2015

Similarity-based Text Recognition by Deeply Supervised Siamese Network

arXiv:1511.04397v5
Originality: Incremental advance
AI Analysis

This work addresses text recognition for machine-printed and handwritten data, offering incremental improvements over conventional Siamese networks.

The paper tackles text recognition by using a deeply supervised Siamese network to measure visual similarity and predict unlabeled text content, achieving an error rate below 0.5% and reducing human estimation costs by 50%-85%.

In this paper, we propose a new text recognition model based on measuring the visual similarity of text and predicting the content of unlabeled texts. First, a Siamese convolutional network is trained with deep supervision on a labeled training dataset. This Deeply Supervised Siamese network learns the visual similarity of texts and projects them into a similarity manifold. Then a K-nearest neighbor classifier predicts the content of unlabeled text based on its similarity distance to labeled texts. The performance of the model is evaluated on three datasets of combined machine-printed and handwritten text. We demonstrate that the model reduces the cost of human estimation by $50\%-85\%$, with a system error of less than $0.5\%$. Owing to deep supervision, the proposed model outperforms a conventional Siamese network at finding visually similar text, whether barely readable or readable, machine-printed or handwritten. The results also demonstrate that the predicted labels are sometimes better than human labels, e.g. through spelling correction.
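The prediction step described in the abstract, a K-nearest neighbor vote over distances in the learned similarity space, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the Siamese network has already mapped each text image to an embedding vector, and it assumes Euclidean distance and a simple majority vote. The function names `euclidean` and `knn_predict`, the 2-D toy embeddings, and the labels are all hypothetical.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(query, labeled, k=3):
    """Predict a label for `query` by majority vote among its k nearest
    labeled embeddings. `labeled` is a list of (embedding, label) pairs.

    Assumption: distance metric and voting rule are illustrative choices,
    not details taken from the paper.
    """
    # Sort labeled items by distance to the query and keep the k closest.
    neighbors = sorted(labeled, key=lambda item: euclidean(query, item[0]))[:k]
    # Count how many of the k neighbors carry each label.
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


# Toy 2-D embeddings standing in for the network's output (hypothetical data).
labeled = [
    ((0.0, 0.1), "cat"),
    ((0.1, 0.0), "cat"),
    ((1.0, 1.1), "dog"),
    ((0.9, 1.0), "dog"),
]

print(knn_predict((0.05, 0.05), labeled, k=3))
```

In this toy setup the query lies next to the two "cat" embeddings, so two of its three nearest neighbors vote "cat". In a real pipeline the embeddings would come from the trained network rather than being written by hand.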

