BookNet: Book Image Rectification via Cross-Page Attention Network
This paper addresses the challenge of rectifying distorted book images for document processing applications, a domain-specific advance in document image rectification.
The paper introduces BookNet, a dual-branch deep learning framework whose cross-page attention explicitly models interactions between left and right pages. The authors report that it outperforms existing state-of-the-art methods in their experiments.
Book image rectification presents unique challenges in document image processing due to complex geometric distortions from binding constraints, where left and right pages exhibit distinctly asymmetric curvature patterns. However, existing single-page document image rectification methods fail to capture the coupled geometric relationships between adjacent pages in books. In this work, we introduce BookNet, the first end-to-end deep learning framework specifically designed for dual-page book image rectification. BookNet adopts a dual-branch architecture with cross-page attention mechanisms, enabling it to estimate warping flows for both individual pages and the complete book spread, explicitly modeling how left and right pages influence each other. Moreover, to address the absence of specialized datasets, we present Book3D, a large-scale synthetic dataset for training, and Book100, a comprehensive real-world benchmark for evaluation. Extensive experiments demonstrate that BookNet outperforms existing state-of-the-art methods on book image rectification. Code and dataset will be made publicly available.
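To make the cross-page attention idea concrete, the following is a minimal sketch of how features from the left-page branch could attend to features from the right-page branch and vice versa. It is not the authors' implementation: the abstract does not specify the layer design, so this sketch assumes single-head scaled dot-product attention with identity projections and a residual connection, and the names `cross_attention` and `cross_page_exchange` are hypothetical. Plain Python lists are used only to keep the example dependency-free.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, context):
    """Let every query token attend over all context tokens.

    Assumption: identity query/key/value projections. A real model
    would use learned projections and typically several heads.
    """
    dim = len(queries[0])
    scale = 1.0 / math.sqrt(dim)
    output = []
    for q in queries:
        # Scaled dot-product similarity between this query and each context token.
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in context]
        weights = softmax(scores)
        # Attention-weighted average of the context tokens.
        attended = [sum(w * v[d] for w, v in zip(weights, context)) for d in range(dim)]
        # Residual connection keeps the page's own features.
        output.append([qi + ai for qi, ai in zip(q, attended)])
    return output


def cross_page_exchange(left_tokens, right_tokens):
    """Hypothetical exchange step: each page branch queries the other page."""
    new_left = cross_attention(left_tokens, right_tokens)
    new_right = cross_attention(right_tokens, left_tokens)
    return new_left, new_right


if __name__ == "__main__":
    left = [[1.0, 0.0], [0.5, 0.5]]    # toy left-page feature tokens
    right = [[0.0, 2.0], [1.0, 1.0]]   # toy right-page feature tokens
    new_left, new_right = cross_page_exchange(left, right)
    print(len(new_left), len(new_right))
```

In this sketch each branch keeps its own token count and feature dimension, so a per-page flow head could consume the updated features unchanged. How BookNet combines the two branches into the flow for the complete book spread is not described in the abstract and is left out here.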