CV LG · Dec 23, 2020

On Calibration of Scene-Text Recognition Models

Ron Slossberg, Oron Anschel, Amir Markovitz, Ron Litman, Aviad Aberdam, Shahar Tsiper, Shai Mazor, Jon Wu, R. Manmatha

arXiv:2012.12643v18.516 citations

Originality: Incremental advance

AI Analysis

This work addresses the problem of overconfidence in scene-text recognition models, which is important for applications requiring reliable confidence scores for text predictions.

This paper investigates word-level confidence calibration for scene-text recognition (STR) models, revealing that existing STR methods are consistently overconfident. The authors demonstrate that character-level calibration can increase word-level calibration error and propose sequence-based extensions to existing calibration methods, reducing calibration error by up to a factor of nearly 7 and improving accuracy when used with beam-search.

In this work, we study the problem of word-level confidence calibration for scene-text recognition (STR). Although confidence calibration has been an active research area for the last several decades, calibration for structured and sequence prediction has been scarcely explored. We analyze several recent STR methods and show that they are consistently overconfident. We then focus on calibrating STR models at the word level rather than the character level. In particular, we demonstrate that for attention-based decoders, calibrating individual character predictions increases word-level calibration error compared to an uncalibrated model. In addition, we apply existing calibration methodologies as well as new sequence-based extensions to numerous STR models, reducing calibration error by up to a factor of nearly 7. Finally, we show consistently improved accuracy by applying our proposed sequence calibration method as a preprocessing step to beam-search.
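
To make "word-level calibration error" concrete, here is a minimal sketch rather than the authors' implementation. It assumes two things the abstract does not spell out: that a word's confidence is the product of its per-character probabilities, and that calibration error is measured as expected calibration error (ECE) over equal-width confidence bins. The function names and the toy predictions are hypothetical and chosen only for illustration.

```python
import math


def word_confidence(char_probs):
    """Assumed word-level score: product of the per-character probabilities."""
    return math.prod(char_probs)


def expected_calibration_error(confidences, correct, n_bins=10):
    """ECE with equal-width bins.

    Each bin contributes |accuracy - mean confidence|, weighted by the
    fraction of samples that fall into it.
    """
    n = len(confidences)
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        # A confidence of exactly 1.0 is placed in the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, ok))

    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        ece += (len(members) / n) * abs(accuracy - avg_conf)
    return ece


# Toy data (made up): per-character probabilities and whether the word was right.
predictions = [
    ([0.99, 0.98, 0.97], True),
    ([0.95, 0.96, 0.99], False),
    ([0.90, 0.92, 0.94], False),
    ([0.99, 0.99, 0.99], True),
]
word_confs = [word_confidence(probs) for probs, _ in predictions]
is_correct = [ok for _, ok in predictions]
ece = expected_calibration_error(word_confs, is_correct)
print(f"word-level ECE on toy data: {ece:.3f}")
```

In this toy set every word scores above 0.75 while only half are correct, which is the kind of gap between confidence and accuracy that the abstract calls overconfidence. Under the product assumption, rescaling each character's probability also changes the word score multiplicatively, so character-level and word-level calibration need not move together.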
