ShaDocNet: Learning Spatial-Aware Tokens in Transformer for Document Shadow Removal
This paper addresses the problem of improving visual quality and legibility in digital document copies, but the contribution is incremental, as it builds on existing neural network approaches.
The paper tackles document shadow removal by proposing a Transformer-based model that uses shadow context encoding and decoding, along with shadow detection and pixel-level enhancement in a coarse-to-fine process, achieving competitive performance with state-of-the-art methods.
Shadow removal improves the visual quality and legibility of digital copies of documents. However, document shadow removal remains an unresolved problem. Traditional techniques rely on heuristics that vary from situation to situation. Given the limited quality and quantity of current public datasets, the majority of neural network models are ill-equipped for this task. In this paper, we propose a Transformer-based model for document shadow removal that utilizes shadow context encoding and decoding in both shadow and shadow-free regions. Additionally, shadow detection and pixel-level enhancement are incorporated into the overall coarse-to-fine process. Comprehensive benchmark evaluations show that the proposed model is competitive with state-of-the-art methods.
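As a rough illustration of what it could mean to treat shadow and shadow-free regions as separate contexts, the sketch below groups image patches by how much of each patch a binary shadow mask covers. This is not the paper's method: the abstract does not say how tokens are formed, so the function name `split_patches_by_shadow`, the `patch` size, and the 0.5 coverage `threshold` are all hypothetical choices made for illustration only.

```python
from typing import List, Tuple


def split_patches_by_shadow(
    mask: List[List[int]], patch: int, threshold: float = 0.5
) -> Tuple[List[int], List[int]]:
    """Group patch indices into shadow and shadow-free sets.

    Assumption (not from the paper): a patch counts as "shadow" when at
    least `threshold` of its pixels are marked 1 in the binary mask.
    Patches are indexed in row-major order.
    """
    height, width = len(mask), len(mask[0])
    if height % patch or width % patch:
        raise ValueError("mask size must be divisible by patch size")

    cols = width // patch
    shadow: List[int] = []
    shadow_free: List[int] = []
    for idx in range((height // patch) * cols):
        top, left = (idx // cols) * patch, (idx % cols) * patch
        # Count shadow pixels inside this patch.
        covered = sum(
            mask[r][c]
            for r in range(top, top + patch)
            for c in range(left, left + patch)
        )
        if covered / (patch * patch) >= threshold:
            shadow.append(idx)
        else:
            shadow_free.append(idx)
    return shadow, shadow_free


# Toy 4x4 mask whose left half is in shadow, split into 2x2 patches.
demo_mask = [[1, 1, 0, 0] for _ in range(4)]
print(split_patches_by_shadow(demo_mask, patch=2))  # ([0, 2], [1, 3])
```

In a Transformer setting, the two index lists could be used to select which patch embeddings are encoded as shadow context and which as shadow-free context, though how the proposed model actually does this is not specified in the abstract.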