Confidence Score for Unsupervised Foreground Background Separation of Document Images
The work addresses the need for reliability estimates in document image analysis, though the contribution is incremental because it builds on existing binarization methods.
The paper tackles the problem of assessing confidence in unsupervised foreground-background separation of document images. It proposes a novel approach for computing confidence scores for the pixel classifications made by binarization algorithms such as Sauvola's, and its results demonstrate the utility of these scores in applications such as document binarization, cleanup, and texture addition.
Foreground-background separation is an important problem in document image analysis. Popular unsupervised binarization methods (such as Sauvola's algorithm) employ adaptive thresholding to classify pixels as foreground or background. In this work, we propose a novel approach for computing confidence scores for the classifications made by such algorithms. The score indicates how confident the algorithm is in each prediction. The computational complexity of the proposed approach is the same as that of the underlying binarization algorithm. Our experiments illustrate the utility of the proposed scores in applications such as document binarization, document image cleanup, and texture addition.
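
To make the setting concrete, the following is a minimal sketch of Sauvola-style adaptive thresholding that also emits a per-pixel score. The abstract does not define the proposed confidence score, so the score used here (the normalized distance between a pixel's intensity and its local threshold) is only an illustrative stand-in and not the authors' formulation. The function names, the window size, and the parameter values `k = 0.2` and `R = 128` are likewise assumptions. Summed-area tables are used so that the score adds only constant work per pixel on top of the thresholding itself.

```python
import math


def integral(img, power=1):
    """Summed-area table of img**power, padded with a zero row and column."""
    h, w = len(img), len(img[0])
    sat = [[0.0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0.0
        for x in range(w):
            row_sum += img[y][x] ** power
            sat[y + 1][x + 1] = sat[y][x + 1] + row_sum
    return sat


def sauvola_with_confidence(img, window=15, k=0.2, R=128.0):
    """Return (labels, conf) for a grayscale image given as a list of rows.

    labels[y][x] is 1 for foreground (dark text) and 0 for background.
    conf[y][x] is a hypothetical score in [0, 1]: |I - T| / 255. It is an
    assumed stand-in, not the score proposed in the paper.
    """
    h, w = len(img), len(img[0])
    s1, s2 = integral(img, 1), integral(img, 2)
    r = window // 2
    labels = [[0] * w for _ in range(h)]
    conf = [[0.0] * w for _ in range(h)]
    for y in range(h):
        y0, y1 = max(0, y - r), min(h, y + r + 1)
        for x in range(w):
            x0, x1 = max(0, x - r), min(w, x + r + 1)
            n = (y1 - y0) * (x1 - x0)
            # Window sums in O(1) via the summed-area tables.
            total = s1[y1][x1] - s1[y0][x1] - s1[y1][x0] + s1[y0][x0]
            total_sq = s2[y1][x1] - s2[y0][x1] - s2[y1][x0] + s2[y0][x0]
            mean = total / n
            std = math.sqrt(max(total_sq / n - mean * mean, 0.0))
            # Sauvola's local threshold.
            t = mean * (1.0 + k * (std / R - 1.0))
            labels[y][x] = 1 if img[y][x] <= t else 0
            conf[y][x] = min(abs(img[y][x] - t) / 255.0, 1.0)
    return labels, conf


# Toy example: a bright 5x5 page with one dark "ink" pixel in the center.
page = [[200] * 5 for _ in range(5)]
page[2][2] = 20
labels, conf = sauvola_with_confidence(page, window=5)
```

In this toy example the dark center pixel lies far below its local threshold and so receives a higher score than the bright pixels around it. A downstream step such as cleanup could use a score like this to treat low-scoring pixels more cautiously.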