GR CV · Apr 11, 2025

COP-GEN-Beta: Unified Generative Modelling of COPernicus Imagery Thumbnails

Miguel Espinosa, Valerio Marsocci, Yuru Jia, Elliot J. Crowley, Mikolaj Czerkawski

arXiv:2504.08548v2 · 4 citations · h-index: 7 · 2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)

Originality: Incremental advance

AI Analysis

This paper addresses the problem of multi-modal data integration in remote sensing and offers a new tool for researchers and practitioners, though the advance appears incremental because it builds on existing generative methods.

The paper tackles the challenge of learning unified representations across multi-modal remote sensing data. It introduces COP-GEN-Beta, a generative diffusion model that enables zero-shot modality translation between optical, radar, and elevation data, with sample quality assessed through qualitative and quantitative evaluations on Major TOM thumbnails.

In remote sensing, multi-modal data from various sensors capturing the same scene offers rich opportunities, but learning a unified representation across these modalities remains a significant challenge. Traditional methods have often been limited to single or dual-modality approaches. In this paper, we introduce COP-GEN-Beta, a generative diffusion model trained on optical, radar, and elevation data from the Major TOM dataset. What sets COP-GEN-Beta apart is its ability to map any subset of modalities to any other, enabling zero-shot modality translation after training. This is achieved through a sequence-based diffusion transformer, where each modality is controlled by its own timestep embedding. We extensively evaluate COP-GEN-Beta on thumbnail images from the Major TOM dataset, demonstrating its effectiveness in generating high-quality samples. Qualitative and quantitative evaluations validate the model's performance, highlighting its potential as a powerful pre-trained model for future remote sensing tasks.
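The per-modality timestep idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the noise schedule, the tokenisation, or how conditioning is realised. The sketch assumes that a modality used as conditioning input is held at timestep 0 (clean) while a modality to be generated starts at the maximum timestep. The modality keys, `T_MAX`, `alpha_bar`, and all function names are hypothetical.

```python
import math
import random

# Hypothetical modality keys; the abstract only says "optical, radar, and elevation".
MODALITIES = ("optical", "radar", "elevation")
T_MAX = 1000  # assumed number of diffusion steps


def assign_timesteps(condition_on, t_generate):
    """Give each modality its own timestep: 0 (clean) if it is a
    conditioning input, t_generate if it is to be generated."""
    unknown = set(condition_on) - set(MODALITIES)
    if unknown:
        raise ValueError(f"unknown modalities: {sorted(unknown)}")
    return {m: 0 if m in condition_on else t_generate for m in MODALITIES}


def alpha_bar(t):
    """Assumed cosine-style signal level: 1.0 at t=0, about 0.0 at t=T_MAX."""
    return math.cos(0.5 * math.pi * t / T_MAX) ** 2


def noise_tokens(tokens, t, rng):
    """Forward-diffuse one modality's token values to timestep t."""
    a = alpha_bar(t)
    return [math.sqrt(a) * x + math.sqrt(1.0 - a) * rng.gauss(0.0, 1.0)
            for x in tokens]


def build_model_input(sample, condition_on, t_generate, rng):
    """Return (modality, timestep, noised tokens) triples in sequence order.
    A transformer would embed each timestep separately per modality."""
    steps = assign_timesteps(condition_on, t_generate)
    return [(m, steps[m], noise_tokens(sample[m], steps[m], rng))
            for m in MODALITIES]


rng = random.Random(0)
sample = {m: [0.1, -0.2, 0.3] for m in MODALITIES}  # toy token values
# Radar -> optical + elevation: radar stays clean, the others are fully noised.
sequence = build_model_input(sample, {"radar"}, T_MAX, rng)
```

Under these assumptions, choosing a different `condition_on` set at sampling time selects a different translation task without retraining, which is one way to read the "any subset of modalities to any other" claim.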
