CV LG, Nov 13, 2015

Similarity-based Text Recognition by Deeply Supervised Siamese Network

arXiv:1511.04397v5

Originality: Incremental advance

AI Analysis

This work addresses text recognition for machine-printed and handwritten data, offering incremental improvements over conventional Siamese networks.

The paper tackles text recognition by using a deeply supervised Siamese network to measure visual similarity and predict unlabeled text content, achieving an error rate below 0.5% and reducing human estimation costs by 50%-85%.

In this paper, we propose a new text recognition model based on measuring the visual similarity of text and predicting the content of unlabeled texts. First, a Siamese convolutional network is trained with deep supervision on a labeled training dataset. This Deeply Supervised Siamese network learns the visual similarity of texts and projects them into a similarity manifold. Then, a K-nearest neighbor classifier predicts the content of unlabeled text based on its similarity distance to labeled texts. The performance of the model is evaluated on three datasets of combined machine-printed and handwritten text. We demonstrate that the model reduces the cost of human estimation by $50\%-85\%$. The error of the system is less than $0.5\%$. Owing to deep supervision, the proposed model outperforms a conventional Siamese network at finding visually similar text, whether barely readable or readable (e.g., machine-printed or handwritten). The results also demonstrate that the predicted labels are sometimes better than human labels, e.g., through spelling correction.
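The prediction step described in the abstract (a K-nearest neighbor vote over distances in the learned similarity space) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the 2-D tuples stand in for embeddings a trained Siamese network would produce, Euclidean distance stands in for the similarity distance, and the `max_distance` rejection rule (deferring to a human when no labeled text is close enough) is a hypothetical addition.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(query, labeled, k=3, max_distance=None):
    """Predict a label for `query` by majority vote among its k nearest
    labeled embeddings.

    labeled: list of (embedding, label) pairs.
    max_distance: hypothetical rejection threshold; if the closest labeled
    embedding is farther than this, return None (defer to a human).
    """
    if not labeled:
        return None
    # Rank labeled examples by distance to the query embedding.
    ranked = sorted(labeled, key=lambda item: euclidean(query, item[0]))
    neighbors = ranked[:k]
    if max_distance is not None and euclidean(query, neighbors[0][0]) > max_distance:
        return None
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


# Toy embeddings standing in for the network's output (assumed values).
labeled = [
    ((0.0, 0.1), "invoice"),
    ((0.1, 0.0), "invoice"),
    ((0.2, 0.1), "invoice"),
    ((2.0, 2.1), "total"),
    ((2.1, 1.9), "total"),
]

prediction = knn_predict((0.05, 0.05), labeled, k=3)
far = knn_predict((10.0, 10.0), labeled, k=3, max_distance=1.0)
```

In this sketch a query near the "invoice" cluster receives that label, while a query far from every labeled embedding is left unlabeled. A rejection rule of this kind is one plausible way a system could trade off automatic labeling against human effort, though the abstract does not specify how the reported savings are obtained.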
