CV · Feb 10, 2025

ViSIR: Vision Transformer Single Image Reconstruction Method for Earth System Models

arXiv:2502.06741v3 · 1 citation · h-index: 11 · IEEE Pulse
Originality: Incremental advance
AI Analysis

This work addresses the need for high-quality data reconstruction in climate modeling, though it appears incremental as it builds on and combines existing techniques.

The paper tackles single-image super-resolution reconstruction of Earth system model data by proposing ViSIR, which combines Vision Transformers with Sinusoidal Representation Networks and reports PSNR gains of up to 8.34 dB over existing methods.
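To give the dB figures some intuition, here is a minimal sketch of how PSNR is conventionally computed from MSE, and what a PSNR gain implies for the error ratio. It uses the standard definition, not code from the paper; the helper names (`mse`, `psnr`, `mse_ratio_for_gain`), the toy grids, and the `data_range=1.0` normalisation are assumptions made for illustration.

```python
import math

def mse(reference, estimate):
    """Mean squared error between two equally sized 2D grids (lists of rows)."""
    total = 0.0
    count = 0
    for ref_row, est_row in zip(reference, estimate):
        for r, e in zip(ref_row, est_row):
            total += (r - e) ** 2
            count += 1
    return total / count

def psnr(reference, estimate, data_range=1.0):
    """Peak signal-to-noise ratio in dB: 10 * log10(data_range^2 / MSE)."""
    err = mse(reference, estimate)
    if err == 0:
        return math.inf  # identical grids: PSNR is unbounded
    return 10.0 * math.log10(data_range ** 2 / err)

def mse_ratio_for_gain(gain_db):
    """Factor by which MSE shrinks for a given PSNR gain (same data range)."""
    return 10.0 ** (gain_db / 10.0)

# Toy example (made-up values, assumed to be normalised to [0, 1]).
reference = [[0.0, 0.5], [1.0, 0.25]]
estimate = [[0.1, 0.4], [0.9, 0.35]]

print(psnr(reference, estimate))   # about 20 dB, since every error is 0.1
print(mse_ratio_for_gain(8.34))    # roughly 6.8x lower MSE
```

Because PSNR is logarithmic in MSE, a gain of 8.34 dB at a fixed data range corresponds to a mean squared error that is lower by a factor of about 6.8.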

Purpose: Earth system models (ESMs) integrate the interactions of the atmosphere, ocean, land, ice, and biosphere to estimate the state of regional and global climate under a wide variety of conditions. ESMs are highly complex; thus, deep neural network architectures are used to model the complexity and store the down-sampled data. This paper proposes the Vision Transformer Sinusoidal Representation Network (ViSIR) to improve single-image super-resolution (SR) reconstruction of ESM data. Methods: ViSIR combines the SR capability of Vision Transformers (ViT) with the high-frequency detail preservation of the Sinusoidal Representation Network (SIREN) to address the spectral bias observed in SR tasks. Results: ViSIR outperforms SRCNN by 2.16 dB, ViT by 6.29 dB, SIREN by 8.34 dB, and SR Generative Adversarial Networks (SRGANs) by 7.93 dB PSNR on average for three different measurements. Conclusion: The proposed ViSIR is evaluated and compared with state-of-the-art methods. The results show that the proposed algorithm outperforms the other methods in terms of Mean Square Error (MSE), Peak Signal-to-Noise Ratio (PSNR), and Structural Similarity Index Measure (SSIM).
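The abstract does not say how ViSIR wires the ViT and SIREN components together, so the sketch below illustrates only the SIREN ingredient: a fully connected layer with a sine activation, which is what lets such networks represent high-frequency detail. The frequency scale `w0 = 30` and the uniform weight initialisation follow the original SIREN formulation. The function names, the zero bias, and the layer sizes are simplifying assumptions, not details taken from the paper.

```python
import math
import random

def init_sine_layer(n_in, n_out, w0=30.0, is_first=False, rng=None):
    """Create weights for one sine layer using SIREN-style uniform init.

    First layer: U(-1/n_in, 1/n_in).
    Later layers: U(-sqrt(6/n_in)/w0, sqrt(6/n_in)/w0).
    Bias is set to zero here purely to keep the sketch short (assumption).
    """
    rng = rng or random.Random(0)
    bound = 1.0 / n_in if is_first else math.sqrt(6.0 / n_in) / w0
    weights = [[rng.uniform(-bound, bound) for _ in range(n_in)]
               for _ in range(n_out)]
    bias = [0.0] * n_out
    return weights, bias

def sine_layer(x, weights, bias, w0=30.0):
    """Forward pass of one layer: y = sin(w0 * (W x + b))."""
    return [
        math.sin(w0 * (sum(w * xi for w, xi in zip(row, x)) + b))
        for row, b in zip(weights, bias)
    ]

# Hypothetical usage: map a normalised 2D grid coordinate to 8 features,
# then pass those features through a second sine layer.
rng = random.Random(42)
w1, b1 = init_sine_layer(2, 8, is_first=True, rng=rng)
w2, b2 = init_sine_layer(8, 8, rng=rng)

coord = [0.25, -0.5]            # (x, y) in [-1, 1], made-up sample point
hidden = sine_layer(coord, w1, b1)
features = sine_layer(hidden, w2, b2)
print(len(features))            # 8 features, each in [-1, 1]
```

A ReLU network fits low frequencies first (the spectral bias mentioned above). The periodic activation and the `w0` scaling are what SIREN uses to counter that tendency.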

