Lalita Kumari

h-index4

4papers

23citations

Novelty50%

AI Score26

Ranked #163,243 of 194,257 authors (top 84%)#52,598 in CV (top 89%)

4 Papers

5.7CVSep 11, 2022

Lexicon and Attention based Handwritten Text Recognition System

Lalita Kumari, Sukhdeep Singh, VVS Rathore et al.

The handwritten text recognition problem is widely studied by the researchers of computer vision community due to its scope of improvement and applicability to daily lives, It is a sub-domain of pattern recognition. Due to advancement of computational power of computers since last few decades neural networks based systems heavily contributed towards providing the state-of-the-art handwritten text recognizers. In the same direction, we have taken two state-of-the art neural networks systems and merged the attention mechanism with it. The attention technique has been widely used in the domain of neural machine translations and automatic speech recognition and now is being implemented in text recognition domain. In this study, we are able to achieve 4.15% character error rate and 9.72% word error rate on IAM dataset, 7.07% character error rate and 16.14% word error rate on GW dataset after merging the attention and word beam search decoder with existing Flor et al. architecture. To analyse further, we have also used system similar to Shi et al. neural network system with greedy decoder and observed 23.27% improvement in character error rate from the base model.

5.7CVMay 23, 2022

A Comprehensive Handwritten Paragraph Text Recognition System: LexiconNet

Lalita Kumari, Sukhdeep Singh, Vaibhav Varish Singh Rathore et al.

In this study, we have presented an efficient procedure using two state-of-the-art approaches from the literature of handwritten text recognition as Vertical Attention Network and Word Beam Search. The attention module is responsible for internal line segmentation that consequently processes a page in a line-by-line manner. At the decoding step, we have added a connectionist temporal classification-based word beam search decoder as a post-processing step. In this study, an end-to-end paragraph recognition system is presented with a lexicon decoder as a post-processing step. Our procedure reports state-of-the-art results on standard datasets. The reported character error rate is 3.24% on the IAM dataset with 27.19% improvement, 1.13% on RIMES with 40.83% improvement and 2.43% on the READ-16 dataset with 32.31% improvement from existing literature and the word error rate is 8.29% on IAM dataset with 43.02% improvement, 2.94% on RIMES dataset with 56.25% improvement and 7.35% on READ-2016 dataset with 47.27% improvement from the existing results. The character error rate and word error rate reported in this work surpass the results reported in the literature.

2.6CVJul 11, 2022

A Lexicon and Depth-wise Separable Convolution Based Handwritten Text Recognition System

Lalita Kumari, Sukhdeep Singh, VVS Rathore et al.

Cursive handwritten text recognition is a challenging research problem in the domain of pattern recognition. The current state-of-the-art approaches include models based on convolutional recurrent neural networks and multi-dimensional long short-term memory recurrent neural networks techniques. These methods are highly computationally extensive as well model is complex at design level. In recent studies, combination of convolutional neural network and gated convolutional neural networks based models demonstrated less number of parameters in comparison to convolutional recurrent neural networks based models. In the direction to reduced the total number of parameters to be trained, in this work, we have used depthwise convolution in place of standard convolutions with a combination of gated-convolutional neural network and bidirectional gated recurrent unit to reduce the total number of parameters to be trained. Additionally, we have also included a lexicon based word beam search decoder at testing step. It also helps in improving the the overall accuracy of the model. We have obtained 3.84% character error rate and 9.40% word error rate on IAM dataset; 4.88% character error rate and 14.56% word error rate in George Washington dataset, respectively.

2.0CVApr 22, 2024

GatedLexiconNet: A Comprehensive End-to-End Handwritten Paragraph Text Recognition System

Lalita Kumari, Sukhdeep Singh, Vaibhav Varish Singh Rathore et al.

The Handwritten Text Recognition problem has been a challenge for researchers for the last few decades, especially in the domain of computer vision, a subdomain of pattern recognition. Variability of texts amongst writers, cursiveness, and different font styles of handwritten texts with degradation of historical text images make it a challenging problem. Recognizing scanned document images in neural network-based systems typically involves a two-step approach: segmentation and recognition. However, this method has several drawbacks. These shortcomings encompass challenges in identifying text regions, analyzing layout diversity within pages, and establishing accurate ground truth segmentation. Consequently, these processes are prone to errors, leading to bottlenecks in achieving high recognition accuracies. Thus, in this study, we present an end-to-end paragraph recognition system that incorporates internal line segmentation and gated convolutional layers based encoder. The gating is a mechanism that controls the flow of information and allows to adaptively selection of the more relevant features in handwritten text recognition models. The attention module plays an important role in performing internal line segmentation, allowing the page to be processed line-by-line. During the decoding step, we have integrated a connectionist temporal classification-based word beam search decoder as a post-processing step. In this work, we have extended existing LexiconNet by carefully applying and utilizing gated convolutional layers in the existing deep neural network. Our results at line and page levels also favour our new GatedLexiconNet. This study reported character error rates of 2.27% on IAM, 0.9% on RIMES, and 2.13% on READ-16, and word error rates of 5.73% on IAM, 2.76% on RIMES, and 6.52% on READ-2016 datasets.