Low-Light Video Enhancement via Spatial-Temporal Consistent Decomposition
This work addresses restoring visibility and reducing noise in low-light videos, a problem that matters for applications such as surveillance and autonomous driving. The contribution appears incremental, however, because it builds on existing decomposition strategies.
The paper tackles low-light video enhancement by proposing a spatial-temporal consistent decomposition method that uses view-independent and view-dependent components with cross-frame correspondences and continuity constraints, achieving state-of-the-art performance on benchmarks.
Low-Light Video Enhancement (LLVE) seeks to restore dynamic or static scenes degraded by severely reduced visibility and noise. In this paper, we present a video decomposition strategy that incorporates view-independent and view-dependent components to improve LLVE performance. We leverage dynamic cross-frame correspondences for the view-independent term (which primarily captures intrinsic appearance) and impose a scene-level continuity constraint on the view-dependent term (which mainly describes the shading condition) to obtain consistent decomposition results. To further enforce consistent decomposition, we introduce a dual-structure enhancement network featuring a cross-frame interaction mechanism. By supervising different frames simultaneously, this network encourages them to exhibit matching decomposition features. The mechanism integrates with encoder-decoder single-frame networks at minimal additional parameter cost. Extensive experiments are conducted on widely recognized LLVE benchmarks covering diverse scenarios. Our framework consistently outperforms existing methods, establishing a new state of the art (SOTA).
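As a rough illustration of the two constraints described above, the following Python sketch assumes a Retinex-style multiplicative model (frame = view-independent term times view-dependent term), which the abstract does not state explicitly. The function names, the integer correspondence map, and the use of a global mean as the scene-level statistic are hypothetical choices made for illustration only; this is not the authors' implementation.

```python
import numpy as np


def recompose(view_independent, view_dependent):
    """ASSUMPTION: element-wise (Retinex-style) product rebuilds the frame."""
    return view_independent * view_dependent


def cross_frame_consistency_loss(r_a, r_b, correspondence):
    """Mean absolute difference between matched view-independent values.

    correspondence is a hypothetical integer array of shape (H, W, 2):
    for each pixel of frame A it stores the (row, col) of its match in frame B.
    """
    rows = correspondence[..., 0]
    cols = correspondence[..., 1]
    r_b_aligned = r_b[rows, cols]  # gather frame B's values at the matched positions
    return float(np.mean(np.abs(r_a - r_b_aligned)))


def scene_continuity_loss(l_a, l_b):
    """ASSUMPTION: 'scene-level continuity' is read here as a small change
    in the global mean of the view-dependent (shading) term between frames."""
    return float(abs(l_a.mean() - l_b.mean()))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    h, w = 4, 5

    # Toy scene: frame B is frame A shifted one pixel to the right (wrapping).
    r_a = rng.random((h, w))
    r_b = np.roll(r_a, 1, axis=1)

    # Pixel (i, j) in A therefore matches pixel (i, (j + 1) % w) in B.
    rows, cols = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    correspondence = np.stack([rows, (cols + 1) % w], axis=-1)

    # Dim, slowly varying shading for the two frames.
    l_a = np.full((h, w), 0.20)
    l_b = np.full((h, w), 0.25)

    low_light_a = recompose(r_a, l_a)
    print("frame A shape:", low_light_a.shape)
    print("consistency loss:", cross_frame_consistency_loss(r_a, r_b, correspondence))
    print("continuity loss:", scene_continuity_loss(l_a, l_b))
```

In this toy setup the consistency term vanishes because the correspondence map exactly undoes the shift, while the continuity term only measures how much the overall shading level drifts between frames. A learned system would presumably estimate both terms and the correspondences with a network rather than take them as given.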