CV · Feb 6, 2024

U-shaped Vision Mamba for Single Image Dehazing

arXiv:2402.04139v472 citations · h-index: 10 · Has Code
Originality: Incremental advance
AI Analysis

This paper addresses image dehazing on resource-constrained devices with a more efficient method, though it appears incremental because it builds on existing State Space Sequence Models.

The paper tackles single image dehazing by introducing U-shaped Vision Mamba (UVM-Net), an efficient network with lower computational complexity than Transformers. The authors report inference in 0.009 seconds for a 325 × 325 image (100 FPS), excluding I/O handling time.
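As a quick check on the reported speed, per-image latency converts to throughput by taking the reciprocal. The helper name `latency_to_fps` below is hypothetical and used only for illustration. A latency of 0.009 seconds corresponds to roughly 111 frames per second, so the quoted 100 FPS reads as a rounded-down figure.

```python
def latency_to_fps(seconds_per_image: float) -> float:
    """Convert per-image latency (seconds) to throughput (frames per second)."""
    if seconds_per_image <= 0:
        raise ValueError("latency must be positive")
    return 1.0 / seconds_per_image


# 0.009 s per image is the figure quoted for a 325 x 325 input.
fps = latency_to_fps(0.009)
print(f"{fps:.1f} FPS")  # prints "111.1 FPS"
```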

Currently, the Transformer is the most popular architecture for image dehazing, but its large computational complexity limits its ability to handle long-range dependencies on resource-constrained devices. To tackle this challenge, we introduce the U-shaped Vision Mamba (UVM-Net), an efficient single-image dehazing network. Inspired by State Space Sequence Models (SSMs), a recent class of deep sequence models known for handling long sequences, we design a Bi-SSM block that integrates the local feature extraction ability of the convolutional layer with the ability of the SSM to capture long-range dependencies. Extensive experimental results demonstrate the effectiveness of our method. Our method provides a more efficient approach to long-range dependency modeling for image dehazing as well as other image restoration tasks. The code is available at https://github.com/zzr-idam/UVM-Net. Our method takes only 0.009 seconds to infer a 325 × 325 resolution image (100 FPS), excluding I/O handling time.
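The idea of pairing a local convolution with a state-space scan can be illustrated with a minimal NumPy sketch on a 1-D sequence. This is not the authors' implementation: the abstract does not define "Bi", so reading it as a bidirectional (forward plus backward) scan is an assumption, as are the diagonal state parameters `a`, `b`, `c`, the residual connection, and the function names `ssm_scan` and `bi_ssm_block`.

```python
import numpy as np


def ssm_scan(x, a, b, c):
    """Diagonal linear SSM: h_t = a * h_{t-1} + b * x_t, y_t = c . h_t.

    x is a (T,) sequence; a, b, c are (N,) state parameters (assumed diagonal).
    """
    h = np.zeros_like(a)
    y = np.empty(len(x))
    for t, x_t in enumerate(x):
        h = a * h + b * x_t      # recurrent state update
        y[t] = np.dot(c, h)      # linear readout
    return y


def bi_ssm_block(x, kernel, a, b, c):
    """Local convolution, then forward and backward scans, plus a residual."""
    local = np.convolve(x, kernel, mode="same")   # local feature extraction
    fwd = ssm_scan(local, a, b, c)                # context from earlier positions
    bwd = ssm_scan(local[::-1], a, b, c)[::-1]    # context from later positions
    return x + fwd + bwd                          # residual connection (assumed)


rng = np.random.default_rng(0)
x = rng.standard_normal(16)
kernel = np.array([0.25, 0.5, 0.25])
a, b, c = np.full(4, 0.9), np.ones(4), np.full(4, 0.25)
y = bi_ssm_block(x, kernel, a, b, c)
print(y.shape)  # prints "(16,)"
```

Each scan costs O(T·N) for a sequence of length T and state size N, so the cost grows linearly with sequence length, whereas standard self-attention grows quadratically in T.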

Code Implementations: 1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
