CV · May 22, 2025

Semi-Supervised State-Space Model with Dynamic Stacking Filter for Real-World Video Deraining

arXiv:2505.16811v1 · 5 citations · h-index: 6 · CVPR
Originality: Incremental advance
AI Analysis

This work addresses the challenge of generalizing video deraining to real-world conditions. The advance is incremental: it builds on existing deep learning methods while introducing novel components for better adaptation.

The authors address real-world video deraining with a dual-branch spatio-temporal state-space model, a dynamic stacking filter, and semi-supervised learning. They report superior quantitative metrics, visual quality, and efficiency across multiple benchmarks.

Significant progress has been made in video restoration under rainy conditions over the past decade, largely propelled by advancements in deep learning. Nevertheless, existing methods that depend on paired data struggle to generalize effectively to real-world scenarios, primarily due to the disparity between synthetic and authentic rain effects. To address these limitations, we propose a dual-branch spatio-temporal state-space model to enhance rain streak removal in video sequences. Specifically, we design spatial and temporal state-space model layers to extract spatial features and incorporate temporal dependencies across frames, respectively. To improve multi-frame feature fusion, we derive a dynamic stacking filter, which adaptively approximates statistical filters for superior pixel-wise feature refinement. Moreover, we develop a median stacking loss to enable semi-supervised learning by generating pseudo-clean patches based on the sparsity prior of rain. To further explore the capacity of deraining models in supporting other vision-based tasks in rainy environments, we introduce a novel real-world benchmark focused on object detection and tracking in rainy conditions. Our method is extensively evaluated across multiple benchmarks containing numerous synthetic and real-world rainy videos, consistently demonstrating its superiority in quantitative metrics, visual quality, efficiency, and its utility for downstream tasks.
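The abstract does not spell out the median stacking loss, so the following is only a minimal sketch of the general idea it names, not the authors' implementation. It assumes a short clip of already aligned grayscale frames with a static background, and that rain is sparse enough that any given pixel is covered by a streak in fewer than half of the frames. Under those assumptions, a pixel-wise temporal median yields a pseudo-clean frame that can serve as a training target for unlabeled real-world video. The names `median_stack` and `l1_loss` are hypothetical and chosen for illustration.

```python
from statistics import median


def median_stack(frames):
    """Pixel-wise temporal median over equally sized grayscale frames.

    frames: list of T frames; each frame is a list of rows of floats.
    Assumes the frames are aligned (hypothetical simplification).
    """
    if not frames:
        raise ValueError("need at least one frame")
    height, width = len(frames[0]), len(frames[0][0])
    return [
        [median(frame[y][x] for frame in frames) for x in range(width)]
        for y in range(height)
    ]


def l1_loss(pred, target):
    """Mean absolute difference between two frames of equal size."""
    diffs = [
        abs(p - t)
        for pred_row, target_row in zip(pred, target)
        for p, t in zip(pred_row, target_row)
    ]
    return sum(diffs) / len(diffs)


# Toy clip: five copies of a 2x2 background, with bright "rain" pixels
# added to two of the frames at different locations.
background = [[0.2, 0.4], [0.6, 0.8]]
frames = [[row[:] for row in background] for _ in range(5)]
frames[1][0][0] = 1.0  # streak in frame 1
frames[3][1][1] = 1.0  # streak in frame 3

# Each pixel is rainy in at most one of five frames, so the median
# recovers the background value at every location.
pseudo_clean = median_stack(frames)

# An unsupervised loss could compare a model output against this target;
# here the rainy input frame stands in for the model output.
loss = l1_loss(frames[1], pseudo_clean)
```

In this toy clip the median recovers the background exactly because each pixel is rainy in at most one frame. With camera motion or dense rain the sparsity assumption breaks down, which is presumably why the paper works on patches and pairs the loss with a learned dynamic stacking filter.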

