CV · Jan 23, 2016

Using compatible shape descriptor for lexicon reduction of printed Farsi subwords

arXiv:1601.06251v1 · 1 citation
Originality: Synthesis-oriented
AI Analysis

This work addresses a domain-specific challenge in Persian text recognition, presenting an incremental improvement for handling varied subword shapes.

The paper tackles lexicon reduction for printed Farsi subwords by using a neural network to select a shape descriptor suited to each input image. An evaluation on a Persian subword dataset supports the approach.

This paper presents a method for lexicon reduction of printed Farsi subwords based on their holistic shape features. Persian subwords are numerous and vary in shape from a single letter to a complex combination of several connected characters, so it is hard to find one fixed shape descriptor that suits all of them. We therefore propose selecting the descriptor according to the shape characteristics of the input. To do this, a neural network is trained to predict the appropriate descriptor for the input image. Within the proposed lexicon reduction system, this network decides which descriptor is used to compare the query image with the lexicon entries. An evaluation on a dataset of Persian subwords demonstrates the effectiveness of treating different query shapes differently.
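The pipeline described in the abstract (pick a descriptor per query, then compare the query with lexicon entries under that descriptor) can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the two descriptors `zoning` and `column_profile`, the aspect-ratio rule in `select_descriptor` (a stand-in for the trained neural network), the Euclidean distance, and the `keep` parameter are not taken from the paper.

```python
from math import sqrt

def zoning(img, rows=2, cols=2):
    """Hypothetical descriptor A: ink density in each cell of a rows x cols grid."""
    h, w = len(img), len(img[0])
    feats = []
    for r in range(rows):
        for c in range(cols):
            cells = [img[y][x]
                     for y in range(r * h // rows, (r + 1) * h // rows)
                     for x in range(c * w // cols, (c + 1) * w // cols)]
            feats.append(sum(cells) / max(len(cells), 1))
    return feats

def column_profile(img, bins=4):
    """Hypothetical descriptor B: ink fraction in each vertical strip."""
    h, w = len(img), len(img[0])
    feats = []
    for b in range(bins):
        xs = range(b * w // bins, (b + 1) * w // bins)
        ink = sum(img[y][x] for y in range(h) for x in xs)
        feats.append(ink / max(h * len(xs), 1))
    return feats

DESCRIPTORS = {"zoning": zoning, "profile": column_profile}

def select_descriptor(img):
    """Stand-in for the trained network: a fixed rule on aspect ratio (assumption)."""
    h, w = len(img), len(img[0])
    return "profile" if w >= 2 * h else "zoning"

def reduce_lexicon(query, lexicon, keep=2):
    """Return the chosen descriptor name and the `keep` closest lexicon labels."""
    name = select_descriptor(query)
    describe = DESCRIPTORS[name]
    q = describe(query)

    def dist(item):
        # Euclidean distance between the query and one lexicon entry,
        # both described with the descriptor selected for this query.
        f = describe(item[1])
        return sqrt(sum((a - b) ** 2 for a, b in zip(q, f)))

    ranked = sorted(lexicon.items(), key=dist)
    return name, [label for label, _ in ranked[:keep]]
```

Here `lexicon` maps a label to a binary image given as a list of rows of 0/1 values. The point of the design is that the descriptor is chosen once per query and then applied to every lexicon entry, so all distances in a single ranking are comparable.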
