CV · AI · Dec 4, 2025

Reflection Removal through Efficient Adaptation of Diffusion Transformers

arXiv:2512.05000v1 · 1 citation · h-index: 1
Originality: Incremental advance
AI Analysis

The work offers a scalable, high-fidelity method for removing reflections from single images, an incremental improvement for computer vision applications such as photography and surveillance.

The paper tackles single-image reflection removal by adapting a pre-trained diffusion transformer with LoRA and training it on synthetic data from a physically based rendering pipeline, achieving state-of-the-art performance on in-domain and zero-shot benchmarks.

We introduce a diffusion-transformer (DiT) framework for single-image reflection removal that leverages the generalization strengths of foundation diffusion models in the restoration setting. Rather than relying on task-specific architectures, we repurpose a pre-trained DiT-based foundation model by conditioning it on reflection-contaminated inputs and guiding it toward clean transmission layers. We systematically analyze existing reflection removal data sources for diversity, scalability, and photorealism. To address the shortage of suitable data, we construct a physically based rendering (PBR) pipeline in Blender, built around the Principled BSDF, to synthesize realistic glass materials and reflection effects. Efficient LoRA-based adaptation of the foundation model, combined with the proposed synthetic data, achieves state-of-the-art performance on in-domain and zero-shot benchmarks. These results demonstrate that pretrained diffusion transformers, when paired with physically grounded data synthesis and efficient adaptation, offer a scalable and high-fidelity solution for reflection removal. Project page: https://hf.co/spaces/huawei-bayerlab/windowseat-reflection-removal-web
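To make the "efficient LoRA-based adaptation" mentioned in the abstract concrete, here is a minimal sketch of the general LoRA idea applied to one linear layer: the pre-trained weight stays frozen and only a low-rank update is trained. It is not the authors' implementation. The class name `LoRALinear`, the helpers, and the `rank` and `alpha` values are illustrative assumptions, and pure-Python lists stand in for tensors.

```python
import random


def transpose(m):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


class LoRALinear:
    """Hypothetical LoRA layer: y = x W^T + (alpha / rank) * x A^T B^T."""

    def __init__(self, weight, rank=2, alpha=4.0, seed=0):
        rng = random.Random(seed)
        out_dim, in_dim = len(weight), len(weight[0])
        self.weight = weight  # frozen pre-trained weight, shape (out_dim, in_dim)
        # A gets a small random init, B starts at zero, so the adapted
        # layer initially reproduces the frozen layer exactly.
        self.A = [[rng.gauss(0.0, 0.01) for _ in range(in_dim)] for _ in range(rank)]
        self.B = [[0.0] * rank for _ in range(out_dim)]
        self.scale = alpha / rank  # assumed scaling convention

    def forward(self, x):
        """x has shape (batch, in_dim); returns shape (batch, out_dim)."""
        base = matmul(x, transpose(self.weight))
        low_rank = matmul(matmul(x, transpose(self.A)), transpose(self.B))
        return [
            [b + self.scale * d for b, d in zip(base_row, delta_row)]
            for base_row, delta_row in zip(base, low_rank)
        ]

    def trainable_params(self):
        """Only A and B would be updated during fine-tuning."""
        rank = len(self.A)
        return rank * len(self.A[0]) + len(self.B) * rank


# Usage: a toy 2x3 frozen weight adapted with a rank-1 update.
W = [[1.0, 2.0, 3.0], [0.0, 1.0, 0.0]]
layer = LoRALinear(W, rank=1)
y = layer.forward([[1.0, 1.0, 1.0]])
```

For a layer with input size d_in and output size d_out, the trainable parameter count is rank × (d_in + d_out) rather than d_in × d_out, which is why this kind of adaptation is cheap for large transformer layers.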

