CV | Feb 18, 2025

Frequency-Aware Vision Transformers for High-Fidelity Super-Resolution of Earth System Models

arXiv:2502.12427v4 | 1 citation | h-index: 11 | Sci Rep
Originality: Incremental advance
AI Analysis

The paper addresses the need for high-fidelity spatial enhancement of climate simulations, a domain-specific advance for climate science applications.

The paper tackles spectral bias in super-resolution methods for Earth System Models: traditional approaches reconstruct low-frequency content better than high-frequency details. It introduces two frequency-aware vision transformer frameworks, ViSIR and ViFOR, with ViFOR achieving up to 2.6 dB higher PSNR than the baselines on climate data.
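To put the reported PSNR gain in context, the following is a minimal sketch of the standard PSNR definition and of what a fixed dB gain implies for mean squared error. The function names `psnr` and `mse_ratio_for_gain` are hypothetical, and using the reference field's value range as the peak is an assumption; the paper's exact normalization is not stated here.

```python
import numpy as np

def psnr(reference, estimate, data_range=None):
    """Peak signal-to-noise ratio in dB between two arrays of equal shape."""
    reference = np.asarray(reference, dtype=np.float64)
    estimate = np.asarray(estimate, dtype=np.float64)
    if data_range is None:
        # Assumption: use the reference field's own value range as the peak.
        data_range = reference.max() - reference.min()
    mse = np.mean((reference - estimate) ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(data_range ** 2 / mse)

def mse_ratio_for_gain(gain_db):
    """Ratio new_mse / old_mse implied by a PSNR gain at a fixed data range."""
    return 10.0 ** (-gain_db / 10.0)

# A 2.6 dB gain corresponds to the MSE dropping to roughly 55% of its old value.
ratio = mse_ratio_for_gain(2.6)
```

Because PSNR is logarithmic in the error, a gain of 2.6 dB at a fixed data range means the mean squared error falls by a little under half.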

Super-resolution (SR) is crucial for enhancing the spatial fidelity of Earth System Model (ESM) outputs, allowing fine-scale structures vital to climate science to be recovered from coarse simulations. However, traditional deep super-resolution methods, including convolutional and transformer-based models, tend to exhibit spectral bias, reconstructing low-frequency content more readily than valuable high-frequency details. In this work, we introduce two frequency-aware frameworks: the Vision Transformer-Tuned Sinusoidal Implicit Representation (ViSIR), combining Vision Transformers and sinusoidal activations to mitigate spectral bias, and the Vision Transformer Fourier Representation Network (ViFOR), which integrates explicit Fourier-based filtering for independent low- and high-frequency learning. Evaluated on the E3SM-HR Earth system dataset across surface temperature, shortwave, and longwave fluxes, these models outperform leading CNN, GAN, and vanilla transformer baselines, with ViFOR demonstrating up to 2.6 dB improvements in PSNR and significantly higher SSIM. Detailed ablation and scaling studies highlight the benefit of full-field training, the impact of frequency hyperparameters, and the potential for generalization. The results establish ViFOR as a state-of-the-art, scalable solution for climate data downscaling. Future extensions will address temporal super-resolution, multimodal climate variables, automated parameter selection, and integration of physical conservation constraints to broaden scientific applicability.
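The idea of handling low- and high-frequency content separately can be illustrated with a simple Fourier-domain split. This is a sketch under stated assumptions, not the ViFOR architecture: the function name `split_frequencies`, the ideal radial mask, and the `cutoff` value are all hypothetical choices made for illustration.

```python
import numpy as np

def split_frequencies(field, cutoff=0.1):
    """Split a 2D field into low- and high-frequency parts with a radial FFT mask.

    cutoff is in cycles per sample (0 to about 0.707); it is an illustrative
    hyperparameter, not a value taken from the paper.
    """
    field = np.asarray(field, dtype=np.float64)
    # Frequency coordinates along each axis, in cycles per sample.
    fy = np.fft.fftfreq(field.shape[0])[:, None]
    fx = np.fft.fftfreq(field.shape[1])[None, :]
    radius = np.sqrt(fx ** 2 + fy ** 2)
    low_mask = radius <= cutoff

    spectrum = np.fft.fft2(field)
    # The mask is symmetric in frequency, so both inverses are real up to rounding.
    low = np.fft.ifft2(spectrum * low_mask).real
    high = np.fft.ifft2(spectrum * ~low_mask).real
    return low, high

# Usage on a synthetic field: the two parts sum back to the input.
rng = np.random.default_rng(0)
field = rng.standard_normal((32, 64))
low, high = split_frequencies(field, cutoff=0.1)
```

The two components partition the spectrum, so they add back to the original field exactly (up to floating-point error). A model that learns each component with its own branch can then weight fine-scale detail without it being dominated by the smooth background.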

