CV · IV · Jun 8, 2023

Neighborhood Attention Makes the Encoder of ResUNet Stronger for Accurate Road Extraction

arXiv:2306.04947v1 · 11 citations · h-index: 74 · Has Code
Originality: Incremental advance
AI Analysis

This work addresses road extraction for remote sensing applications, presenting an incremental improvement over existing methods.

The paper tackles road extraction from aerial imagery by proposing ResUNetFormer, a neural network combining residual learning, HetConvs, UNet, and vision transformers, which outperforms state-of-the-art CNNs and vision transformers on the Massachusetts road dataset.

In remote sensing image interpretation, road extraction from high-resolution aerial imagery has long been an active research topic. Although deep CNNs have produced excellent results for semantic segmentation, the efficiency and capabilities of vision transformers have yet to be fully explored. For accurate road extraction, this letter therefore proposes ResUNetFormer, a deep semantic segmentation neural network that combines residual learning, HetConvs, UNet, and vision transformers. ResUNetFormer is evaluated against various cutting-edge deep learning-based road extraction techniques on the public Massachusetts road dataset. Statistical and visual results demonstrate the superiority of ResUNetFormer over state-of-the-art CNNs and vision transformers for segmentation. The code will be made publicly available at https://github.com/aj1365/ResUNetFormer.

Code Implementations: 1 repo
