CV · AI · GR · Jan 12

SIRR-LMM: Single-image Reflection Removal via Large Multimodal Model

arXiv:2601.07209v1 · h-index: 3
Originality: Incremental advance
AI Analysis

This paper addresses the challenge of removing reflections from glass surfaces in images, which matters for applications such as photography and computer vision. It appears incremental, however, because it builds on existing LMM capabilities with a new dataset and a fine-tuning approach.

The paper tackles single-image reflection removal by introducing a synthetic dataset generation framework that path-traces 3D glass models over real backgrounds, and by fine-tuning a Large Multimodal Model with task-specific LoRA. It reports improved performance compared with state-of-the-art methods.
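For readers unfamiliar with LoRA, the idea of fine-tuning with a low-rank adapter instead of full-parameter training can be sketched as follows. This is a minimal NumPy illustration of the general LoRA technique, not the paper's implementation: the class name `LoRALinear`, the rank, the scaling factor `alpha`, and the layer sizes are all assumptions chosen for the example.

```python
import numpy as np


class LoRALinear:
    """Frozen weight W plus a trainable low-rank update (alpha / rank) * B @ A.

    Hypothetical helper for illustration only; the sizes and hyperparameters
    are assumptions, not values from the paper.
    """

    def __init__(self, weight, rank=4, alpha=8.0, seed=0):
        rng = np.random.default_rng(seed)
        out_features, in_features = weight.shape
        self.weight = weight  # pretrained weight, kept frozen
        # A starts small and random, B starts at zero, so the adapter
        # contributes nothing until training updates B.
        self.A = rng.normal(0.0, 0.01, size=(rank, in_features))
        self.B = np.zeros((out_features, rank))
        self.scale = alpha / rank

    def __call__(self, x):
        # x has shape (batch, in_features)
        base = x @ self.weight.T
        update = (x @ self.A.T) @ self.B.T
        return base + self.scale * update

    def trainable_params(self):
        # Only A and B would receive gradients during fine-tuning.
        return self.A.size + self.B.size


rng = np.random.default_rng(1)
W = rng.normal(size=(64, 128))
layer = LoRALinear(W, rank=4)
x = rng.normal(size=(2, 128))
y = layer(x)
```

With these assumed sizes the adapter holds 768 trainable values against 8192 in the frozen weight, which is the kind of saving that makes adapter fine-tuning attractive for large models.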

Glass surfaces create complex interactions of reflected and transmitted light, making single-image reflection removal (SIRR) challenging. Existing datasets suffer from limited physical realism in synthetic data or insufficient scale in real captures. We introduce a synthetic dataset generation framework that path-traces 3D glass models over real background imagery to create physically accurate reflection scenarios with varied glass properties, camera settings, and post-processing effects. To leverage the capabilities of a Large Multimodal Model (LMM), we concatenate the image layers into a single composite input, apply joint captioning, and fine-tune the model using task-specific LoRA rather than full-parameter training. This enables our approach to achieve improved reflection removal and separation performance compared to state-of-the-art methods.
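The step of concatenating image layers into a single composite input could look roughly like the sketch below. The abstract does not specify which layers are used or how they are arranged, so the choice of three layers (blended input, transmission, reflection), the side-by-side layout, and the function names `make_composite` and `split_composite` are assumptions made purely for illustration.

```python
import numpy as np


def make_composite(blended, transmission, reflection):
    """Place three equally sized (H, W, C) layers side by side.

    Assumed layout: the abstract only says the layers are concatenated
    into one composite, not how.
    """
    layers = [blended, transmission, reflection]
    if len({layer.shape for layer in layers}) != 1:
        raise ValueError("all layers must share the same (H, W, C) shape")
    return np.concatenate(layers, axis=1)  # join along the width axis


def split_composite(composite, n_layers=3):
    """Recover the individual layers from a side-by-side composite."""
    return np.split(composite, n_layers, axis=1)


# Dummy 4x6 RGB layers filled with distinct constants.
h, w = 4, 6
blended = np.full((h, w, 3), 0.8)
transmission = np.full((h, w, 3), 0.5)
reflection = np.full((h, w, 3), 0.3)

composite = make_composite(blended, transmission, reflection)
parts = split_composite(composite)
```

Packing the layers into one image lets a model that accepts a single image see the input and its target layers together; other layouts, such as a vertical stack or a grid, would serve the same purpose.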

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
