CV, May 28, 2018

Confidence Prediction for Lexicon-Free OCR

arXiv:1805.11161v19.621 citations

Originality: Incremental advance

AI Analysis

This paper addresses the need for error filtering in OCR applications where no lexicon is available, though it appears incremental because it builds on existing confidence measurement techniques.

The paper tackles the problem of reliable confidence scoring for lexicon-free OCR in order to reduce false readings, and reports a significant reduction in misreads on standard benchmarks and a proprietary dataset.

Having a reliable accuracy score is crucial for real-world applications of OCR, since such systems are judged by the number of false readings. Lexicon-based OCR systems, which deal with what is essentially a multi-class classification problem, often employ methods that explicitly take the lexicon into account in order to improve accuracy. However, in lexicon-free scenarios, filtering errors requires an explicit confidence calculation. In this work we present two explicit confidence measurement techniques and show that they achieve a significant reduction in misreads on both standard benchmarks and a proprietary dataset.
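The abstract does not describe the two confidence measures themselves, so the following is only a generic sketch of confidence-based filtering, not the paper's method. It assumes a recognizer that returns, for each reading, the top probability of every predicted character. The scoring rule (geometric mean of those probabilities), the function names, the sample readings, and the 0.9 threshold are all illustrative assumptions.

```python
import math


def sequence_confidence(char_probs):
    """Hypothetical score: geometric mean of per-character top probabilities.

    This is an illustrative choice, not one of the measures from the paper.
    """
    if not char_probs or min(char_probs) <= 0.0:
        return 0.0
    log_sum = sum(math.log(p) for p in char_probs)
    return math.exp(log_sum / len(char_probs))


def filter_readings(readings, threshold=0.9):
    """Keep only the readings whose confidence reaches the threshold."""
    accepted = []
    for text, char_probs in readings:
        if sequence_confidence(char_probs) >= threshold:
            accepted.append(text)
    return accepted


# Made-up recognizer output: (text, per-character top probabilities).
readings = [
    ("INVOICE", [0.99, 0.98, 0.99, 0.97, 0.99, 0.98, 0.99]),
    ("1NV0ICE", [0.55, 0.98, 0.99, 0.60, 0.99, 0.98, 0.99]),
]

print(filter_readings(readings))  # the low-confidence reading is dropped
```

Raising the threshold rejects more readings, trading fewer misreads for more unread inputs; where to set it depends on the cost of a false reading in the target application.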
