Dewarping Document Image By Displacement Flow Estimation with Fully Convolutional Network
This work addresses the need for better recognition performance on camera-captured documents by providing an effective dewarping solution, though it appears incremental because it builds on existing displacement estimation methods.
The paper tackles the problem of rectifying distorted document images from cameras by proposing a framework that uses a fully convolutional network to estimate pixel-wise displacements and apply transformations, achieving state-of-the-art performance in local details and overall effect.
As camera-captured documents are increasingly used, rectifying distorted document images has become necessary to improve recognition performance. In this paper, we propose a novel framework that both rectifies distorted document images and finely removes the background, by estimating pixel-wise displacements with a fully convolutional network (FCN). The document image is rectified by transforming it according to the displacements of its pixels. The FCN is trained by regressing the displacements of synthesized distorted documents, and to control the smoothness of the displacements, we propose a Local Smooth Constraint (LSC) as a regularization term. Our approach is easy to implement and consumes moderate computing resources. Experiments show that our approach can dewarp document images effectively under various geometric distortions, and that it achieves state-of-the-art performance in terms of local details and overall effect.
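To make the two ideas in the abstract concrete, the following is a minimal sketch, not the paper's implementation. It assumes the network outputs one `(dy, dx)` displacement per pixel plus a foreground mask, that rectification moves each foreground pixel forward by its displacement, and that a smoothness term can be approximated by the mean L1 difference between displacements of adjacent foreground pixels. The function names `rectify` and `local_smoothness_penalty`, the mask input, and the exact form of the penalty are all assumptions made for illustration; the abstract does not give the formula for LSC. Plain Python lists are used to keep the sketch dependency-free.

```python
def rectify(image, flow, mask, fill=0):
    """Forward-map each foreground pixel by its displacement (hypothetical helper).

    image: H x W nested list of pixel values (distorted input)
    flow:  H x W nested list of (dy, dx) displacements, one per pixel
    mask:  H x W nested list of booleans, True for document pixels
    """
    h, w = len(image), len(image[0])
    out = [[fill] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if not mask[y][x]:
                continue  # background pixels are dropped, which removes the background
            dy, dx = flow[y][x]
            ty, tx = round(y + dy), round(x + dx)  # nearest target pixel
            if 0 <= ty < h and 0 <= tx < w:
                out[ty][tx] = image[y][x]
    return out


def local_smoothness_penalty(flow, mask):
    """Mean L1 difference between displacements of adjacent foreground pixels.

    This is an assumed stand-in for a local smoothness regularizer,
    not the LSC formula from the paper.
    """
    h, w = len(flow), len(flow[0])
    total, count = 0.0, 0
    for y in range(h):
        for x in range(w):
            if not mask[y][x]:
                continue
            # Visit only the lower and right neighbours so each pair counts once.
            for ny, nx in ((y + 1, x), (y, x + 1)):
                if ny < h and nx < w and mask[ny][nx]:
                    total += abs(flow[y][x][0] - flow[ny][nx][0])
                    total += abs(flow[y][x][1] - flow[ny][nx][1])
                    count += 1
    return total / count if count else 0.0


# Toy example: a 3x3 image whose middle row is the "document".
image = [[0, 0, 0], [1, 2, 3], [0, 0, 0]]
mask = [[False] * 3, [True] * 3, [False] * 3]
flow = [[(0, 0)] * 3, [(-1, 0)] * 3, [(0, 0)] * 3]  # shift the document up by one row

rectified = rectify(image, flow, mask)
penalty = local_smoothness_penalty(flow, mask)
```

Because forward mapping with rounding can leave holes where no source pixel lands, a practical pipeline would likely interpolate the scattered pixels rather than use this nearest-pixel scatter.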