CV · May 22, 2024

MetaEarth: A Generative Foundation Model for Global-Scale Remote Sensing Image Generation

arXiv:2405.13570v3 · 63 citations · h-index: 41 · IEEE Trans Pattern Anal Mach Intell
Originality: Incremental advance
AI Analysis

This work addresses the need for scalable image generation in remote sensing, with potential applications in data augmentation and world simulation, though it appears incremental in that it extends existing generative models to a new domain.

The paper tackles the problem of generating global-scale remote sensing images, a task for which existing methods are limited to smaller scenes. It introduces MetaEarth, a generative foundation model that produces worldwide, multi-resolution, unbounded images, and reports experiments demonstrating these capabilities.

The recent advancement of generative foundation models has ushered in a new era of image generation in the realm of natural images, revolutionizing art design, entertainment, environment simulation, and beyond. Despite producing high-quality samples, existing methods are constrained to generating images of scenes at a limited scale. In this paper, we present MetaEarth, a generative foundation model that breaks this barrier by scaling image generation to a global level, exploring the creation of worldwide, multi-resolution, unbounded, and virtually limitless remote sensing images. In MetaEarth, we propose a resolution-guided self-cascading generative framework, which enables the generation of images for any region across a wide range of geographical resolutions. To achieve unbounded and arbitrary-sized image generation, we design a novel noise sampling strategy for denoising diffusion models by analyzing the generation conditions and initial noise. To train MetaEarth, we construct a large dataset comprising multi-resolution optical remote sensing images with geographical information. Experiments demonstrate the powerful capabilities of our method in generating global-scale images. Additionally, MetaEarth serves as a data engine that can provide high-quality and rich training data for downstream tasks. Our model opens up new possibilities for constructing generative world models by simulating Earth visuals from an innovative overhead perspective.
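The abstract does not spell out the noise sampling strategy, so the following is only an illustrative sketch of one common way to keep tile-wise diffusion generation consistent across an arbitrarily large canvas, not the paper's method. It assumes a single Gaussian noise field is drawn for the whole canvas and that each overlapping tile crops its initial noise from that field, so neighbouring tiles start from identical noise where they overlap. The function names (`make_noise_field`, `tile_origins`, `crop_noise`) and the tile and overlap sizes are hypothetical.

```python
import numpy as np


def make_noise_field(height, width, channels=3, seed=0):
    """Draw one Gaussian noise field covering the whole target canvas."""
    rng = np.random.default_rng(seed)
    return rng.standard_normal((height, width, channels))


def tile_origins(height, width, tile, overlap):
    """Top-left corners of overlapping tiles that cover the canvas.

    Assumes height >= tile, width >= tile, and 0 <= overlap < tile.
    """
    stride = tile - overlap
    tops = list(range(0, height - tile + 1, stride))
    lefts = list(range(0, width - tile + 1, stride))
    # Add a final row/column of tiles if the stride does not reach the edge.
    if tops[-1] != height - tile:
        tops.append(height - tile)
    if lefts[-1] != width - tile:
        lefts.append(width - tile)
    return [(top, left) for top in tops for left in lefts]


def crop_noise(field, top, left, tile):
    """Initial noise for one tile, cropped from the shared field."""
    return field[top:top + tile, left:left + tile]


if __name__ == "__main__":
    field = make_noise_field(512, 512)
    for top, left in tile_origins(512, 512, tile=256, overlap=64):
        noise = crop_noise(field, top, left, tile=256)
        # A (hypothetical) denoiser would start from `noise` for this tile.
        print(top, left, noise.shape)
```

Because every tile reads from the same field, two horizontally adjacent tiles share exactly the same initial noise in their overlapping columns. This removes one source of seams, although a real system would still need to reconcile the denoised outputs in the overlap.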

