CV · IV · Apr 11, 2023

Investigating Imbalances Between SAR and Optical Utilization for Multi-Modal Urban Mapping

arXiv:2304.05080v1 · 1 citation · h-index: 43
Originality: Synthesis-oriented
AI Analysis

This paper is relevant to urban mapping researchers because it highlights a potential bottleneck in multi-modal learning: a network may lean on one modality and neglect the other. The contribution is incremental, since it identifies the imbalance without proposing a fix.

The paper investigates the imbalanced utilization of SAR and optical data in multi-modal deep neural networks for urban mapping, finding that conventional fusion achieves an F1 score of 0.682 ± 0.014 but under-utilizes optical data.

Accurate urban maps provide essential information to support sustainable urban development. Recent urban mapping methods use multi-modal deep neural networks to fuse Synthetic Aperture Radar (SAR) and optical data. However, multi-modal networks may rely on just one modality due to the greedy nature of learning. In turn, the imbalanced utilization of modalities can negatively affect the generalization ability of a network. In this paper, we investigate the utilization of SAR and optical data for urban mapping. To that end, a dual-branch network architecture using intermediate fusion modules to share information between the uni-modal branches is utilized. A cut-off mechanism in the fusion modules enables the stopping of information flow between the branches, which is used to estimate the network's dependence on SAR and optical data. While our experiments on the SEN12 Global Urban Mapping dataset show that good performance can be achieved with conventional SAR-optical data fusion (F1 score = 0.682 ± 0.014), we also observed a clear under-utilization of optical data. Therefore, future work is required to investigate whether a more balanced utilization of SAR and optical data can lead to performance improvements.
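The cut-off idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the averaging fusion rule, the function names (`fusion_module`, `forward`, `f1_score`), the 0.5 threshold, and the six-pixel scores are all assumptions made up for illustration. It only shows how comparing F1 with and without cross-branch information flow gives a rough signal of how much each branch depends on the other modality.

```python
def fusion_module(own_feat, other_feat, cut_off=False):
    """Hypothetical fusion: average own features with the other branch's.

    With cut_off=True no information crosses between the branches.
    """
    if cut_off:
        return list(own_feat)
    return [(a + b) / 2 for a, b in zip(own_feat, other_feat)]


def forward(sar_feat, opt_feat, cut_off=False):
    """Toy dual-branch pass returning per-pixel scores for each branch."""
    sar_branch = fusion_module(sar_feat, opt_feat, cut_off)
    opt_branch = fusion_module(opt_feat, sar_feat, cut_off)
    return sar_branch, opt_branch


def f1_score(pred, target):
    """F1 for binary labels (1 = built-up, 0 = background)."""
    tp = sum(1 for p, t in zip(pred, target) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(pred, target) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, target) if p == 0 and t == 1)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def threshold(scores, thr=0.5):
    return [1 if s > thr else 0 for s in scores]


# Made-up per-pixel scores; not taken from the SEN12 dataset.
target = [1, 1, 1, 0, 0, 0]
sar = [0.9, 0.8, 0.3, 0.2, 0.1, 0.6]
opt = [0.7, 0.4, 0.9, 0.1, 0.3, 0.2]

fused_sar, _ = forward(sar, opt, cut_off=False)
cut_sar, cut_opt = forward(sar, opt, cut_off=True)

f1_fused = f1_score(threshold(fused_sar), target)
f1_sar_cut = f1_score(threshold(cut_sar), target)
f1_opt_cut = f1_score(threshold(cut_opt), target)

# A large drop after cutting the flow suggests the branch relied on the
# other modality; a small drop suggests it mostly ignored it.
print("fused:", f1_fused)
print("SAR branch, flow cut:", f1_sar_cut)
print("optical branch, flow cut:", f1_opt_cut)
```

In a real network the fusion modules sit between intermediate layers and the comparison is made on a held-out test set, so the numbers here say nothing about the reported F1 score.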
