Multi-Modal Deep Learning for Multi-Temporal Urban Mapping With a Partly Missing Optical Modality
This work addresses urban mapping for remote sensing applications. Its contribution is incremental: it builds on existing multi-modal methods and focuses specifically on handling missing data.
The paper tackles multi-temporal urban mapping when optical satellite data are partly missing due to clouds. It proposes a multi-modal deep learning approach that combines SAR and optical data, and it outperforms baselines that either replace missing optical data with zeros or use SAR data alone.
This paper proposes a novel multi-temporal urban mapping approach using multi-modal satellite data from the Sentinel-1 Synthetic Aperture Radar (SAR) and Sentinel-2 MultiSpectral Instrument (MSI) missions. In particular, it focuses on the problem of a partly missing optical modality due to clouds. The proposed model utilizes two networks to extract features from each modality separately. In addition, a reconstruction network is utilized to approximate the optical features based on the SAR data in case of a missing optical modality. Our experiments on a multi-temporal urban mapping dataset with Sentinel-1 SAR and Sentinel-2 MSI data demonstrate that the proposed method outperforms a multi-modal approach that uses zero values as a replacement for missing optical data, as well as a uni-modal SAR-based approach. Therefore, the proposed method is effective in exploiting multi-modal data, if available, but it also retains its effectiveness in case the optical modality is missing.
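The data flow described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. Random linear maps with a ReLU stand in for the trained networks, and every name here (`make_linear`, `sar_encoder`, `optical_encoder`, `reconstructor`, `predict_urban`), along with the band counts, the feature size, and the concatenation-based fusion, is an assumption made for illustration. The sketch also assumes the reconstruction network takes the raw SAR input, which the abstract does not specify.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: 2 SAR bands, 4 optical bands, 8 features per modality.
SAR_BANDS, OPT_BANDS, FEAT = 2, 4, 8


def make_linear(in_dim, out_dim):
    """Random linear map followed by ReLU, standing in for a trained network."""
    weights = rng.normal(scale=0.1, size=(in_dim, out_dim))
    return lambda x: np.maximum(x @ weights, 0.0)


sar_encoder = make_linear(SAR_BANDS, FEAT)      # SAR -> SAR features
optical_encoder = make_linear(OPT_BANDS, FEAT)  # optical -> optical features
reconstructor = make_linear(SAR_BANDS, FEAT)    # SAR -> approximate optical features
head_weights = rng.normal(scale=0.1, size=2 * FEAT)


def predict_urban(sar, optical, optical_available):
    """Return one urban probability per sample.

    sar: (n, SAR_BANDS), optical: (n, OPT_BANDS),
    optical_available: (n,) boolean mask, False where clouds hide the scene.
    """
    sar_features = sar_encoder(sar)
    real_optical = optical_encoder(optical)
    reconstructed_optical = reconstructor(sar)
    # Use real optical features where available, reconstructed ones otherwise.
    optical_features = np.where(
        optical_available[:, None], real_optical, reconstructed_optical
    )
    fused = np.concatenate([sar_features, optical_features], axis=1)
    logits = fused @ head_weights
    return 1.0 / (1.0 + np.exp(-logits))  # sigmoid


sar = rng.normal(size=(3, SAR_BANDS))
optical = rng.normal(size=(3, OPT_BANDS))
available = np.array([True, False, True])
probabilities = predict_urban(sar, optical, available)
```

In this sketch the prediction for a sample flagged as missing depends only on its SAR input, so whatever placeholder values sit in the optical array for that sample have no effect. The zero-replacement baseline mentioned in the abstract would instead pass those placeholder zeros through the optical branch.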