CV | Jan 29, 2024

Cross-Scale MAE: A Tale of Multi-Scale Exploitation in Remote Sensing

arXiv:2401.15855v1 | 74 citations | h-index: 2 | NIPS
Originality: Incremental advance
AI Analysis

This work addresses challenges in remote sensing image analysis, such as geographic coverage and multi-scale misalignment, offering an incremental improvement for domain-specific applications.

The paper addresses multi-scale representation learning for remote sensing images. It proposes Cross-Scale MAE, a self-supervised model that combines scale augmentation with cross-scale consistency constraints, and reports better performance than standard MAE and other state-of-the-art remote sensing methods.

Remote sensing images present unique challenges to image analysis due to the extensive geographic coverage, hardware limitations, and misaligned multi-scale images. This paper revisits the classical multi-scale representation learning problem but under the general framework of self-supervised learning for remote sensing image understanding. We present Cross-Scale MAE, a self-supervised model built upon the Masked Auto-Encoder (MAE). During pre-training, Cross-Scale MAE employs scale augmentation techniques and enforces cross-scale consistency constraints through both contrastive and generative losses to ensure consistent and meaningful representations well-suited for a wide range of downstream tasks. Further, our implementation leverages the xFormers library to accelerate network pre-training on a single GPU while maintaining the quality of learned representations. Experimental evaluations demonstrate that Cross-Scale MAE exhibits superior performance compared to standard MAE and other state-of-the-art remote sensing MAE methods.
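To make the idea of cross-scale consistency concrete, here is a minimal pure-Python sketch. It is not the authors' implementation: the abstract states only that scale augmentation is combined with contrastive and generative consistency losses. The use of average pooling for scale augmentation, InfoNCE for the contrastive term, mean squared error for the generative term, and the names `downsample`, `info_nce`, `mse`, `cross_scale_loss`, `w_contrast`, and `w_gen` are all illustrative assumptions.

```python
import math


def downsample(image, factor):
    """Average-pool a 2D grid (list of rows) by an integer factor.

    Assumption: this stands in for scale augmentation, i.e. producing a
    coarser view of the same scene.
    """
    h, w = len(image), len(image[0])
    out = []
    for i in range(0, h - h % factor, factor):
        row = []
        for j in range(0, w - w % factor, factor):
            block = [image[i + di][j + dj]
                     for di in range(factor) for dj in range(factor)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


def _normalize(v):
    """Scale a vector to unit length (zero vectors are left unchanged)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def info_nce(z_a, z_b, temperature=0.1):
    """InfoNCE loss: z_a[i] should match z_b[i] against every other z_b[j]."""
    z_a = [_normalize(v) for v in z_a]
    z_b = [_normalize(v) for v in z_b]
    total = 0.0
    for i, a in enumerate(z_a):
        logits = [sum(x * y for x, y in zip(a, b)) / temperature for b in z_b]
        m = max(logits)  # subtract the max for a numerically stable log-sum-exp
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_denom - logits[i]
    return total / len(z_a)


def mse(x, y):
    """Mean squared error between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(x, y)) / len(x)


def cross_scale_loss(z_hi, z_lo, d_hi, d_lo, w_contrast=1.0, w_gen=1.0):
    """Weighted sum of a contrastive term on encoder embeddings (z_*) and a
    generative consistency term on decoder features (d_*).

    The weights and the exact choice of terms are hypothetical.
    """
    gen = sum(mse(a, b) for a, b in zip(d_hi, d_lo)) / len(d_hi)
    return w_contrast * info_nce(z_hi, z_lo, 0.1) + w_gen * gen
```

In this sketch, `z_hi` and `z_lo` would be embeddings of the same batch of images at the original and downsampled scales. The loss is small when each image's two views agree with each other more than with the other images in the batch.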

Code Implementations: 1 repo