CV · AI · Dec 9, 2025

MatteViT: High-Frequency-Aware Document Shadow Removal with Shadow Matte Guidance

arXiv:2512.08789v1 · h-index: 1
Originality: Incremental advance
AI Analysis

This work addresses document shadow removal, which improves the clarity of digitized documents for applications such as OCR. The advance is incremental: it builds on existing shadow removal methods and adds new components.

The paper tackles document shadow removal with MatteViT, a framework that combines high-frequency amplification with shadow matte guidance to remove shadows while preserving fine details. It reports state-of-the-art performance on the RDD and Kligler benchmarks and improved optical character recognition accuracy.

Document shadow removal is essential for enhancing the clarity of digitized documents. Preserving high-frequency details (e.g., text edges and lines) is critical in this process because shadows often obscure or distort fine structures. This paper proposes a matte vision transformer (MatteViT), a novel shadow removal framework that applies spatial and frequency-domain information to eliminate shadows while preserving fine-grained structural details. To effectively retain these details, we employ two preservation strategies. First, our method introduces a lightweight high-frequency amplification module (HFAM) that decomposes and adaptively amplifies high-frequency components. Second, we present a continuous luminance-based shadow matte, generated using a custom-built matte dataset and shadow matte generator, which provides precise spatial guidance from the earliest processing stage. These strategies enable the model to accurately identify fine-grained regions and restore them with high fidelity. Extensive experiments on public benchmarks (RDD and Kligler) demonstrate that MatteViT achieves state-of-the-art performance, providing a robust and practical solution for real-world document shadow removal. Furthermore, the proposed method better preserves text-level details in downstream tasks, such as optical character recognition, improving recognition performance over prior methods.
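The two preservation strategies in the abstract can be illustrated with a small NumPy sketch. This is not the paper's implementation: the abstract does not specify how HFAM decomposes frequencies or how the matte generator is built, so the code assumes a box-blur low-pass split with a fixed gain (the paper's amplification is adaptive) and a simple luminance ratio for the matte. The function names `box_blur`, `amplify_high_freq`, and `luminance_matte`, and the parameters `gain`, `radius`, and `eps`, are hypothetical and chosen only for illustration. Inputs are assumed to be 2D grayscale arrays in [0, 1].

```python
import numpy as np


def box_blur(img: np.ndarray, radius: int = 2) -> np.ndarray:
    """Low-pass filter: mean over a (2*radius+1)^2 window with edge padding."""
    k = 2 * radius + 1
    padded = np.pad(img, radius, mode="edge")
    out = np.zeros(img.shape, dtype=np.float64)
    # Sum every shifted copy of the image covered by the window.
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / (k * k)


def amplify_high_freq(img: np.ndarray, gain: float = 1.5, radius: int = 2) -> np.ndarray:
    """Split img into low + high frequencies and boost the high part.

    Assumption: a fixed scalar gain stands in for the adaptive
    amplification described in the abstract.
    """
    img = img.astype(np.float64)
    low = box_blur(img, radius)
    high = img - low  # residual carries text edges and thin lines
    return np.clip(low + gain * high, 0.0, 1.0)


def luminance_matte(shadowed: np.ndarray, shadow_free: np.ndarray, eps: float = 1e-6) -> np.ndarray:
    """Continuous matte in [0, 1]: ratio of shadowed to shadow-free luminance.

    Assumption: values near 1 mean no shadow, values near 0 mean deep shadow.
    """
    shadowed = shadowed.astype(np.float64)
    shadow_free = shadow_free.astype(np.float64)
    return np.clip(shadowed / (shadow_free + eps), 0.0, 1.0)
```

With `gain=1.0` the decomposition reconstructs the input unchanged, and flat regions pass through untouched for any gain because their high-frequency residual is zero. A continuous matte, unlike a binary mask, can represent the soft penumbra at a shadow boundary, which matches the abstract's emphasis on a continuous luminance-based guidance signal.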

