CV | LG | Jul 29, 2024

Towards Knowledge Guided Pretraining Approaches for Multimodal Foundation Models: Applications in Remote Sensing

arXiv:2407.19660v4 | 2 citations | h-index: 19
Originality: Incremental advance
AI Analysis

This work addresses the problem of improving multimodal foundation models for remote sensing applications by incorporating causal knowledge, representing an incremental advancement over existing pretraining paradigms.

The paper addresses a limitation of existing pretraining approaches, which do not fully capture causal relationships between geospatial variables. It proposes Knowledge Guided Variable-Step Forecasting (KG-VSF), a pretraining task that models forecasting as conditional generation, with driver variables informing the prediction of response variables. The authors report that this improves performance on downstream tasks such as crop type mapping and soil moisture estimation compared to standard pretraining methods.

Self-supervised learning has emerged as a powerful paradigm for pretraining foundation models using large-scale data. Existing pretraining approaches predominantly rely on masked reconstruction or next-token prediction strategies, demonstrating strong performance across various downstream tasks, including geoscience applications. However, these approaches do not fully capture the knowledge of causal interplay between different geospatial and environmental variables. To address this limitation, we propose Knowledge Guided Variable-Step Forecasting (KG-VSF), a novel pretraining task that models forecasting as a conditional generation task, where driver variables (e.g., weather) inform the prediction of response variables (e.g., satellite imagery). We demonstrate that pretraining in this fashion yields strong embeddings that, when finetuned, outperform embeddings from other standard pretraining approaches on downstream tasks where capturing this causality matters, such as pixel-wise crop type mapping, soil moisture estimation and forecasting, missing image prediction, and future image forecasting.
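
To make the pretraining task concrete, here is a minimal sketch of how one training example for variable-step forecasting might be assembled. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify how steps are sampled or how drivers are windowed, so the names `ForecastSample` and `sample_variable_step_pair`, the uniform sampling of the step, and the choice to pass drivers for times t+1 through t+k are all hypothetical.

```python
import random
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class ForecastSample:
    """One (hypothetical) pretraining example for variable-step forecasting."""
    source_image: Sequence[float]   # response variable observed at time t
    drivers: List[Sequence[float]]  # driver variables (e.g., weather) for t+1..t+k
    step: int                       # forecast horizon k, varies per sample
    target_image: Sequence[float]   # response variable to predict at time t+k


def sample_variable_step_pair(images, weather, max_step, rng):
    """Pick a random start time t and horizon k, then pair inputs with the target.

    Assumption: images[i] and weather[i] are aligned to the same time index i.
    """
    if len(images) != len(weather):
        raise ValueError("images and weather must cover the same time steps")
    if len(images) < 2 or max_step < 1:
        raise ValueError("need at least two time steps and max_step >= 1")

    t = rng.randrange(len(images) - 1)                       # start index
    k = rng.randint(1, min(max_step, len(images) - 1 - t))   # variable horizon
    return ForecastSample(
        source_image=images[t],
        drivers=list(weather[t + 1 : t + k + 1]),
        step=k,
        target_image=images[t + k],
    )


if __name__ == "__main__":
    rng = random.Random(0)
    images = [[float(i)] for i in range(10)]   # toy one-pixel "images"
    weather = [[0.1 * i] for i in range(10)]   # toy one-feature weather
    sample = sample_variable_step_pair(images, weather, max_step=3, rng=rng)
    print(sample.step, len(sample.drivers))
```

In this sketch a model would be trained to generate `target_image` conditioned on `source_image` and `drivers`, so the drivers act as the conditioning signal for the response variable. Varying `step` per sample is one plausible reading of "variable-step"; the paper may define it differently.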
