UHD Image Dehazing via anDehazeFormer with Atmospheric-aware KV Cache
This work targets the computational cost of image restoration at 4K/8K resolution, specifically dehazing, and is relevant to researchers and practitioners in that area. The contribution is incremental, since it builds on existing transformer-based methods.
The paper tackles slow training and high memory consumption in ultra-high-definition image dehazing by proposing an efficient visual transformer framework. It reports 5x faster training convergence and real-time processing of 50 high-resolution images per second on an RTX 4090 GPU while maintaining state-of-the-art dehazing quality.
In this paper, we propose an efficient visual transformer framework for ultra-high-definition (UHD) image dehazing that addresses two key challenges of existing methods: slow training and high memory consumption. Our approach introduces two key innovations: 1) an \textbf{a}daptive \textbf{n}ormalization mechanism, inspired by the nGPT architecture, that enables ultra-fast and stable training by restricting the range of values the network parameters can express; and 2) an atmospheric scattering-aware KV caching mechanism that dynamically optimizes feature preservation based on the physical haze formation model. The proposed architecture accelerates training convergence by \textbf{5$\times$} while reducing memory overhead, enabling real-time processing of 50 high-resolution images per second on an RTX 4090 GPU. Experimental results show that our approach maintains state-of-the-art dehazing quality while significantly improving computational efficiency for 4K/8K image restoration tasks. Furthermore, we provide a new interpretability method for image dehazing based on integrated-gradient attribution maps. Our code can be found here: https://anonymous.4open.science/r/anDehazeFormer-632E/README.md.
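For readers unfamiliar with the two ingredients named above, the following minimal sketch may help. It is not the anDehazeFormer implementation. It assumes the standard atmospheric scattering model, I = J t + A (1 - t) with transmission t = exp(-beta d), as the "physical haze formation model", and it assumes that the nGPT-inspired normalization amounts to rescaling vectors to unit L2 norm. All function names and parameters (`transmission`, `add_haze`, `remove_haze`, `unit_normalize`, `beta`, `t_min`) are hypothetical and chosen for illustration only.

```python
import math


def transmission(depth, beta=1.0):
    """Transmission t = exp(-beta * d) for scene depth d (assumed model)."""
    return math.exp(-beta * depth)


def add_haze(clear, t, airlight):
    """Scattering model for one pixel intensity: I = J*t + A*(1 - t)."""
    return clear * t + airlight * (1.0 - t)


def remove_haze(hazy, t, airlight, t_min=0.1):
    """Invert the model: J = (I - A) / t + A.

    t is clamped from below by t_min (a hypothetical safeguard) so that
    dense haze does not cause division by a value near zero.
    """
    t = max(t, t_min)
    return (hazy - airlight) / t + airlight


def unit_normalize(vec, eps=1e-12):
    """Rescale a vector to unit L2 norm, in the spirit of nGPT.

    This is only an illustration of restricting the range of values a
    vector can take; the paper's adaptive mechanism may differ.
    """
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / max(norm, eps) for v in vec]


if __name__ == "__main__":
    t = transmission(depth=2.0, beta=0.5)
    hazy = add_haze(clear=0.2, t=t, airlight=0.9)
    restored = remove_haze(hazy, t, airlight=0.9)
    print(f"t={t:.4f} hazy={hazy:.4f} restored={restored:.4f}")
    print(unit_normalize([3.0, 4.0]))
```

In this sketch, dehazing is exact whenever the transmission and airlight are known and t exceeds `t_min`. In practice both quantities must be estimated, which is where a learned model comes in.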