CV · May 11

M2Retinexformer: Multi-Modal Retinexformer for Low-Light Image Enhancement

arXiv:2605.12556 · 50.1 · Has Code
Predicted impact top 69% in CV · last 90 days · Originality: Synthesis-oriented
AI Analysis

For low-light image enhancement, this work incrementally improves upon the existing Retinexformer by adding multi-modal cues (depth, luminance, and semantics).

M2Retinexformer extends Retinexformer by incorporating depth, luminance, and semantic cues via cross-attention and adaptive gating, achieving overall improvements over state-of-the-art methods on LOL, SID, SMID, and SDSD benchmarks.

Low-light image enhancement is challenging due to complex degradations, including amplified noise, artifacts, and color distortion. While Retinex-based deep learning methods have achieved promising results, they primarily rely on single-modality RGB information. We propose M2Retinexformer (Multi-Modal Retinexformer), a novel framework that extends Retinexformer by incorporating depth cues, luminance priors, and semantic features within a progressive refinement pipeline. Depth provides geometric context that is invariant to lighting variations, while luminance and semantic features offer explicit guidance on brightness distribution and scene understanding. Modalities are extracted at multiple scales and fused through cross-attention, with adaptive gating dynamically balancing illumination-guided self-attention and cross-attention based on the reliability of auxiliary cues. Evaluations on the LOL, SID, SMID, and SDSD benchmarks demonstrate overall improvements over Retinexformer and recent state-of-the-art methods. Code and pretrained weights are available at https://github.com/YoussefAboelwafa/M2Retinexformer
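The gated fusion described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names (`attention`, `init_params`, `gated_fusion`), the single-head layout, the way illumination features scale the value tokens, and the sigmoid gate over concatenated branch outputs are all assumptions made for illustration. The repository linked above is the reference for the actual architecture.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def attention(q, k, v):
    # Plain scaled dot-product attention (single head).
    scores = q @ k.T / np.sqrt(q.shape[-1])
    return softmax(scores, axis=-1) @ v

def init_params(d, rng):
    # Hypothetical parameter set: projections for both branches plus a gate.
    names = ["Wq_s", "Wk_s", "Wv_s", "Wq_c", "Wk_c", "Wv_c"]
    p = {n: rng.normal(scale=d ** -0.5, size=(d, d)) for n in names}
    p["Wg"] = rng.normal(scale=(2 * d) ** -0.5, size=(2 * d, d))
    p["bg"] = np.zeros(d)
    return p

def gated_fusion(rgb, aux, illum, p):
    """rgb: (N, d) image tokens, aux: (M, d) auxiliary-modality tokens
    (e.g. depth, luminance, or semantic features), illum: (N, d)
    illumination features. Returns the fused tokens and the gate."""
    # Self-attention branch; values are scaled by illumination features
    # (assumed form of illumination guidance).
    self_out = attention(rgb @ p["Wq_s"], rgb @ p["Wk_s"],
                         (rgb @ p["Wv_s"]) * illum)
    # Cross-attention branch: image tokens query the auxiliary tokens.
    cross_out = attention(rgb @ p["Wq_c"], aux @ p["Wk_c"], aux @ p["Wv_c"])
    # Per-token, per-channel gate in (0, 1), predicted from both branches.
    gate = sigmoid(np.concatenate([self_out, cross_out], axis=-1) @ p["Wg"]
                   + p["bg"])
    # Convex combination: gate near 1 favours self-attention,
    # gate near 0 favours the auxiliary cues.
    return gate * self_out + (1.0 - gate) * cross_out, gate

rng = np.random.default_rng(0)
N, M, d = 16, 8, 32  # toy sizes, chosen arbitrarily
rgb = rng.normal(size=(N, d))
aux = rng.normal(size=(M, d))
illum = rng.uniform(0.0, 1.0, size=(N, d))
fused, gate = gated_fusion(rgb, aux, illum, init_params(d, rng))
print(fused.shape, gate.shape)
```

In this sketch the gate is learned per token and per channel, so an unreliable auxiliary cue in one region can be down-weighted without suppressing it elsewhere. Whether the paper gates at this granularity is not stated in the abstract.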

