CV · Oct 10, 2025

Dynamic Weight-based Temporal Aggregation for Low-light Video Enhancement

arXiv:2510.09450v1 · h-index: 23
Originality: Incremental advance
AI Analysis

This work improves low-light video enhancement for real-world applications, representing an incremental advancement in leveraging temporal cues.

The paper tackles low-light video enhancement, where heavy noise persists because temporal information is not used effectively. It proposes DWTA-Net, a two-stage framework that suppresses noise and artifacts and reports superior visual quality compared with state-of-the-art methods.

Low-light video enhancement (LLVE) is challenging due to noise, low contrast, and color degradations. Learning-based approaches offer fast inference but still struggle with heavy noise in real low-light scenes, primarily due to limitations in effectively leveraging temporal information. In this paper, we address this issue with DWTA-Net, a novel two-stage framework that jointly exploits short- and long-term temporal cues. Stage I employs Visual State-Space blocks for multi-frame alignment, recovering brightness, color, and structure with local consistency. Stage II introduces a recurrent refinement module with dynamic weight-based temporal aggregation guided by optical flow, adaptively balancing static and dynamic regions. A texture-adaptive loss further preserves fine details while promoting smoothness in flat areas. Experiments on real-world low-light videos show that DWTA-Net effectively suppresses noise and artifacts, delivering superior visual quality compared with state-of-the-art methods.
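The idea of flow-guided dynamic weighting in Stage II can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not give the weighting formula, so the Gaussian falloff on flow magnitude, the function name `dynamic_weight_aggregate`, and the parameters `sigma` and `w_max` are all assumptions chosen for illustration. The sketch also assumes the previous refined frame has already been aligned to the current one.

```python
import numpy as np

def dynamic_weight_aggregate(current, previous, flow, sigma=1.0, w_max=0.8):
    """Blend the current frame with an aligned, previously refined frame.

    Hypothetical illustration only, not the DWTA-Net formulation.
    current, previous: float arrays of shape (H, W, C).
    flow: optical flow of shape (H, W, 2), in pixels per frame.
    """
    # Per-pixel motion magnitude from the flow field, shape (H, W).
    magnitude = np.linalg.norm(flow, axis=-1)
    # Assumed weighting: static pixels (small flow) trust the history up to
    # w_max; fast-moving pixels fall back to the current frame.
    w_prev = w_max * np.exp(-(magnitude / sigma) ** 2)
    w_prev = w_prev[..., None]  # broadcast over the channel axis
    return w_prev * previous + (1.0 - w_prev) * current

# Usage: a static scene leans on the history, a moving one on the current frame.
cur = np.ones((4, 4, 3))
prev = np.zeros((4, 4, 3))
static_out = dynamic_weight_aggregate(cur, prev, np.zeros((4, 4, 2)))
moving_out = dynamic_weight_aggregate(cur, prev, np.full((4, 4, 2), 100.0))
```

In static regions the blend averages over time, which reduces noise; in dynamic regions it avoids ghosting by discounting the history. A learned network would predict these weights rather than use a fixed formula.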

