CV AI LG · Oct 21, 2025

SITS-DECO: A Generative Decoder Is All You Need For Multitask Satellite Image Time Series Modelling

arXiv:2510.21813v1

Originality: Highly original

AI Analysis

This work provides a lightweight, practical approach for multi-task Earth Observation modeling, bridging toward generative foundation models in the domain.

The paper addresses the rigid adaptation requirements of Earth Observation foundation models by introducing SITS-DECO, a generative decoder-only model that uses unified token sequences for multitask satellite image time series modeling. It outperforms much larger EO foundation models on crop-type classification on PASTIS-R.

Earth Observation (EO) Foundation Modelling (FM) holds great promise for simplifying and improving the use of EO data for diverse real-world tasks. However, most existing models require additional adaptation before they can be used and are structured rigidly around particular data sources or training approaches. To address this, we take inspiration from large language models, where diverse tasks, both pre-training and downstream, are implicitly captured through next-token prediction over unified token sequences, leveraging the structure and diversity of the training data. We introduce SITS-DECO (Satellite Image Time Series-DECoder Only), a proof-of-concept generative model that applies this unified-sequence framing to EO data. We use a simple GPT-style decoder-only architecture and demonstrate its ability to perform useful EO tasks (pixel-wise, multi-temporal, multi-modal crop-type classification) in a purely generative framework. Through symbolic prompting, we show that the model can perform multiple supervised and self-supervised tasks within a single unified architecture, without task- or modality-specific adaptation. Despite its simplicity and lack of spatial context, SITS-DECO outperforms much larger EO foundation models on crop-type classification (PASTIS-R), demonstrating that dense temporal sequence modelling is a critical missing ingredient in the current paradigm. This work exemplifies a data-centric modelling paradigm in which capability arises from the diversity and structure of the training data rather than from architectural complexity. SITS-DECO provides a lightweight, practical route to multi-modal, multi-task EO modelling, and a conceptual bridge toward future generative EO foundation models.
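
To make the unified-sequence framing more concrete, the following is a minimal sketch of how one pixel's multi-sensor time series and its crop label might be flattened into a single symbolic token sequence and turned into next-token prediction pairs. It is not the paper's implementation: the token names (`<task:...>`, `<sensor:...>`, `<doy:...>`, `<val:...>`, `<label>`, `<class:...>`), the 16-bin quantisation, the example sensors, and the helper functions `quantise`, `build_sequence`, and `next_token_pairs` are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

N_BINS = 16  # hypothetical number of quantisation bins per band value


def quantise(value: float, n_bins: int = N_BINS) -> int:
    """Map a reflectance-like value in [0, 1] to an integer bin index."""
    clipped = min(max(value, 0.0), 1.0)
    return min(int(clipped * n_bins), n_bins - 1)


def build_sequence(
    task: str,
    observations: Sequence[Tuple[str, int, Sequence[float]]],
    label: str,
) -> List[str]:
    """Flatten one pixel's time series into a single symbolic token sequence.

    observations: (sensor name, day of year, band values) per acquisition.
    The token vocabulary here is illustrative, not taken from the paper.
    """
    tokens = ["<bos>", f"<task:{task}>"]
    for sensor, day_of_year, bands in observations:
        tokens.append(f"<sensor:{sensor}>")
        tokens.append(f"<doy:{day_of_year}>")
        tokens.extend(f"<val:{quantise(v)}>" for v in bands)
    tokens += ["<label>", f"<class:{label}>", "<eos>"]
    return tokens


def next_token_pairs(tokens: Sequence[str]) -> List[Tuple[List[str], str]]:
    """Causal language-modelling pairs: each prefix predicts the token after it."""
    return [(list(tokens[:i]), tokens[i]) for i in range(1, len(tokens))]


# Made-up acquisitions for a single pixel (sensor, day of year, band values).
pixel = [
    ("S2", 120, [0.08, 0.31]),
    ("S1", 123, [0.52]),
    ("S2", 150, [0.05, 0.62]),
]
sequence = build_sequence("crop_type", pixel, "maize")
pairs = next_token_pairs(sequence)
print(sequence)
```

Under this framing, classification needs no separate head: at inference the sequence would be truncated after `<label>` and the decoder asked for the next token. Changing the task token or the position of the prompt symbol is, in this sketch, what would select a different supervised or self-supervised task.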
