CV, Nov 27, 2019

Methods of Weighted Combination for Text Field Recognition in a Video Stream

arXiv:1911.12028v1, 6 citations
Originality: Synthesis-oriented
AI Analysis

This work addresses the need for better document recognition on mobile devices, where image distortions are common, by leveraging video input. It appears incremental, however, as it builds on existing recognition methods.

The paper tackles the problem of improving text field recognition from video streams on mobile devices. It proposes a weighted method for combining the recognition results from multiple frames and concludes that this approach is appropriate for improving recognition quality.

As the applicability of document recognition has expanded noticeably, demand for recognition on mobile devices is high. Unlike a scanner, a mobile camera cannot always guarantee an image free of distortions, so improving recognition precision is a relevant task. The advantage of mobile devices over scanners is the ability to use video stream input, which makes it possible to capture multiple images of the document being recognized. Despite this, not enough attention is currently paid to the problem of combining the recognition results obtained from different frames of a video stream. In this paper we propose a weighted method for combining text string recognition results, together with weighting criteria, and provide experimental data to verify their validity and effectiveness. Based on the obtained results, we conclude that such a weighted combination is appropriate for improving the quality of the video stream recognition result.
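To make the idea of weighted combination concrete, here is a minimal sketch of per-character weighted voting across frames. It is an illustration under strong assumptions, not the method from the paper: the function name `combine_weighted` is hypothetical, all per-frame strings are assumed to have the same length (so the string alignment step a real combiner would need is skipped), and the weights are assumed to be supplied by some per-frame quality criterion that is not modelled here.

```python
from collections import defaultdict


def combine_weighted(frame_results):
    """Combine per-frame recognition results by weighted character voting.

    frame_results: list of (text, weight) pairs. Assumption of this sketch:
    every text has the same length, so no alignment is needed.
    """
    if not frame_results:
        return ""

    length = len(frame_results[0][0])
    if any(len(text) != length for text, _ in frame_results):
        raise ValueError("this sketch assumes equal-length strings")

    combined = []
    for pos in range(length):
        # Sum the weights of every frame that voted for each character.
        votes = defaultdict(float)
        for text, weight in frame_results:
            votes[text[pos]] += weight
        # Keep the character with the largest accumulated weight.
        combined.append(max(votes, key=votes.get))
    return "".join(combined)


# Hypothetical input: two low-weight frames misread "I" as "1".
frames = [("SM1TH", 0.2), ("SMITH", 0.9), ("SM1TH", 0.3)]
result = combine_weighted(frames)
```

With these made-up weights the single high-weight frame outvotes the two low-weight ones at the third character, whereas an unweighted majority vote over the same frames would keep the misread character.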
