Stereo Waterdrop Removal with Row-wise Dilated Attention
This work addresses a practical problem for vision systems in autonomous vehicles and robots by improving waterdrop removal, although the contribution is incremental because it builds on existing stereo and attention methods.
The paper tackles the problem of waterdrop removal from stereo images for autonomous driving and robotics, proposing a learning-based model with a row-wise dilated attention module and attention consistency loss, which outperforms state-of-the-art methods on a newly collected real-world dataset.
Existing vision systems for autonomous driving and robotics are sensitive to waterdrops adhering to windows or camera lenses. Most recent waterdrop removal approaches take a single image as input and often fail to faithfully recover the content hidden behind the waterdrops. We therefore propose a learning-based model for waterdrop removal from stereo images. To better detect and remove waterdrops, we propose a novel row-wise dilated attention module that enlarges the receptive field of attention for effective information propagation between the two stereo images. In addition, we propose an attention consistency loss between the ground-truth disparity map and the attention scores to enhance left-right consistency in stereo images. Because no related dataset is available, we collect a real-world dataset that contains stereo images with and without waterdrops. Extensive experiments on our dataset suggest that our model outperforms state-of-the-art methods both quantitatively and qualitatively. Our source code and the stereo waterdrop dataset are available at https://github.com/VivianSZF/Stereo-Waterdrop-Removal.
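The abstract does not spell out how the row-wise dilated attention module or the attention consistency loss is computed, so the following is only a minimal NumPy sketch of one plausible reading, not the authors' implementation. It assumes rectified stereo pairs (corresponding pixels lie on the same row), that "dilated" means sampling candidate columns at strided horizontal offsets, and that the loss is an L1 difference between the attention-weighted expected offset and the ground-truth disparity. The function names `rowwise_dilated_attention` and `attention_consistency_loss`, the `offsets` parameter, and the L1 form are all hypothetical.

```python
import numpy as np

def rowwise_dilated_attention(left, right, offsets):
    """Hypothetical sketch: each left pixel attends to right pixels on the same row.

    left, right: (H, W, C) feature maps from a rectified stereo pair.
    offsets:     1-D integer array of horizontal shifts; a stride > 1 between
                 entries is the assumed "dilation". It should include 0 so that
                 every pixel has at least one in-bounds candidate.
    Returns the fused features (H, W, C) and attention weights (H, W, K).
    """
    H, W, C = left.shape
    offsets = np.asarray(offsets)
    # Candidate column in the right image for every (left column, offset) pair.
    cols = np.arange(W)[:, None] - offsets[None, :]            # (W, K)
    valid = (cols >= 0) & (cols < W)                           # in-bounds mask
    cand = right[:, np.clip(cols, 0, W - 1), :]                # (H, W, K, C)
    # Scaled dot-product scores, restricted to the same row.
    scores = np.einsum("hwc,hwkc->hwk", left, cand) / np.sqrt(C)
    scores = np.where(valid[None], scores, -1e9)               # mask out-of-bounds
    scores = scores - scores.max(axis=-1, keepdims=True)       # stable softmax
    attn = np.exp(scores)
    attn = attn / attn.sum(axis=-1, keepdims=True)
    # Aggregate right-view features into the left view.
    fused = np.einsum("hwk,hwkc->hwc", attn, cand)
    return fused, attn

def attention_consistency_loss(attn, offsets, gt_disparity):
    """Assumed L1 form: expected offset under the attention vs. true disparity."""
    expected = (attn * np.asarray(offsets, dtype=float)).sum(axis=-1)  # (..., W)
    return np.abs(expected - gt_disparity).mean()

# Toy usage: the left view is the right view shifted by a disparity of 4 pixels.
W = 16
right = 10.0 * np.eye(W)[None].repeat(3, axis=0)   # (3, 16, 16), one-hot columns
left = np.roll(right, 4, axis=1)                   # left[:, x] == right[:, x - 4]
offsets = np.array([0, 2, 4, 6])                   # dilation of 2 along the row
fused, attn = rowwise_dilated_attention(left, right, offsets)
loss = attention_consistency_loss(attn[:, 4:], offsets, np.full((3, W - 4), 4.0))
```

In this toy case the attention concentrates on the offset equal to the true disparity, so the loss over the non-wrapped columns is close to zero. With strided offsets, the same number of candidates covers a wider disparity range than contiguous offsets would, which is one way to read the "enlarged receptive field" claimed in the abstract.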