Rao Kotamarthi

LG
h-index: 31
Papers: 6
Citations: 120
Novelty: 48%
AI Score: 41

6 Papers

AI · Oct 6, 2023
DeepSpeed4Science Initiative: Enabling Large-Scale Scientific Discovery through Sophisticated AI System Technologies

Shuaiwen Leon Song, Bonnie Kruft, Minjia Zhang et al. · microsoft-research

In the upcoming decade, deep learning may revolutionize the natural sciences, enhancing our capacity to model and predict natural occurrences. This could herald a new era of scientific exploration, bringing significant advancements across sectors from drug development to renewable energy. To answer this call, we present the DeepSpeed4Science initiative (deepspeed4science.ai), which aims to build unique capabilities through AI system technology innovations to help domain experts unlock today's biggest science mysteries. By leveraging DeepSpeed's current technology pillars (training, inference, and compression) as base technology enablers, DeepSpeed4Science will create a new set of AI system technologies tailored for accelerating scientific discoveries by addressing their unique complexity beyond the common technical approaches used for accelerating generic large language models (LLMs). In this paper, we showcase the early progress we made with DeepSpeed4Science in addressing two of the critical system challenges in structural biology research.

ML · Mar 23, 2022
A Deep Learning Approach to Probabilistic Forecasting of Weather

Nick Rittler, Carlo Graziani, Jiali Wang et al.

We discuss an approach to probabilistic forecasting based on two chained machine-learning steps: a dimensional reduction step that learns a reduction map of predictor information to a low-dimensional space in a manner designed to preserve information about forecast quantities; and a density estimation step that uses the probabilistic machine learning technique of normalizing flows to compute the joint probability density of reduced predictors and forecast quantities. This joint density is then renormalized to produce the conditional forecast distribution. In this method, probabilistic calibration testing plays the role of a regularization procedure, preventing overfitting in the second step, while effective dimensional reduction from the first step is the source of forecast sharpness. We verify the method using a 22-year 1-hour cadence time series of Weather Research and Forecasting (WRF) simulation data of surface wind on a grid.
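The renormalization step described above (joint density to conditional forecast distribution) can be illustrated with a minimal sketch. It assumes the joint density has already been evaluated on a uniform grid; the paper obtains it from a normalizing flow, whereas this sketch substitutes an analytic bivariate Gaussian purely for illustration. The function names, the grid, and the correlation value are hypothetical.

```python
import numpy as np

def conditional_density(joint, dy):
    """Renormalize joint[i, j] = p(x_i, y_j) into p(y_j | x_i).

    The marginal p(x_i) is approximated by a Riemann sum over the
    uniform y grid with spacing dy.
    """
    marginal = joint.sum(axis=1, keepdims=True) * dy
    return joint / marginal

def gaussian_joint(x, y, rho):
    """Stand-in joint density: standard bivariate Gaussian, correlation rho.

    Assumption: this replaces the learned normalizing-flow density.
    """
    X, Y = np.meshgrid(x, y, indexing="ij")
    norm = 1.0 / (2.0 * np.pi * np.sqrt(1.0 - rho**2))
    expo = -(X**2 - 2.0 * rho * X * Y + Y**2) / (2.0 * (1.0 - rho**2))
    return norm * np.exp(expo)

x = np.linspace(-3.0, 3.0, 61)      # reduced predictor (1-D here)
y = np.linspace(-8.0, 8.0, 1601)    # forecast quantity
dy = y[1] - y[0]
rho = 0.8

cond = conditional_density(gaussian_joint(x, y, rho), dy)

# Conditional mean of y given x = 1; for this Gaussian it should be close to rho.
i = int(np.argmin(np.abs(x - 1.0)))
cond_mean = float((cond[i] * y).sum() * dy)
print(cond_mean)
```

Each row of `cond` integrates to one over `y`, which is what makes it a valid forecast distribution for the corresponding predictor value.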

LG · Sep 30, 2025
Swift: An Autoregressive Consistency Model for Efficient Weather Forecasting

Jason Stock, Troy Arcomano, Rao Kotamarthi

Diffusion models offer a physically grounded framework for probabilistic weather forecasting, but their typical reliance on slow, iterative solvers during inference makes them impractical for subseasonal-to-seasonal (S2S) applications where long lead times and domain-driven calibration are essential. To address this, we introduce Swift, a single-step consistency model that, for the first time, enables autoregressive finetuning of a probability flow model with a continuous ranked probability score (CRPS) objective. This eliminates the need for multi-model ensembling or parameter perturbations. Results show that Swift produces skillful 6-hourly forecasts that remain stable for up to 75 days, running $39\times$ faster than state-of-the-art diffusion baselines while achieving forecast skill competitive with the numerical-based, operational IFS ENS. This marks a step toward efficient and reliable ensemble forecasting from medium-range to seasonal scales.
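For readers unfamiliar with CRPS, the following is a minimal sketch of the standard empirical ensemble estimator, CRPS = E|X - y| - 0.5 E|X - X'|, for a single scalar forecast. The abstract does not state which CRPS variant Swift optimizes (for example, a "fair" estimator or a per-grid-point average), so this is an illustration of the score itself and not the paper's training loss. The function name is hypothetical.

```python
import numpy as np

def crps_ensemble(members, obs):
    """Empirical CRPS of an ensemble against one scalar observation.

    members: 1-D sequence of ensemble forecasts.
    obs: the verifying observation.
    Lower is better; for a single member this reduces to absolute error.
    """
    members = np.asarray(members, dtype=float)
    # Mean absolute error of the members against the observation.
    accuracy = np.mean(np.abs(members - obs))
    # Half the mean absolute difference between all member pairs.
    spread = 0.5 * np.mean(np.abs(members[:, None] - members[None, :]))
    return float(accuracy - spread)

# Two members that bracket the observation score better than
# either member would on its own.
print(crps_ensemble([0.0, 2.0], 1.0))
print(crps_ensemble([2.0], 1.0))
```

The spread term is what rewards an ensemble for representing uncertainty, which is why CRPS is a natural objective for probabilistic forecasts.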

LG · Sep 16, 2025
AERIS: Argonne Earth Systems Model for Reliable and Skillful Predictions

Väinö Hatanpää, Eugene Ku, Jason Stock et al.

Generative machine learning offers new opportunities to better understand complex Earth system dynamics. Recent diffusion-based methods address spectral biases and improve ensemble calibration in weather forecasting compared to deterministic methods, yet have so far proven difficult to scale stably at high resolutions. We introduce AERIS, a 1.3 to 80B parameter pixel-level Swin diffusion transformer to address this gap, and SWiPe, a generalizable technique that composes window parallelism with sequence and pipeline parallelism to shard window-based transformers without added communication cost or increased global batch size. On Aurora (10,080 nodes), AERIS sustains 10.21 ExaFLOPS (mixed precision) and a peak performance of 11.21 ExaFLOPS with $1 \times 1$ patch size on the 0.25° ERA5 dataset, achieving 95.5% weak scaling efficiency, and 81.6% strong scaling efficiency. AERIS outperforms the IFS ENS and remains stable on seasonal scales to 90 days, highlighting the potential of billion-parameter diffusion models for weather and climate prediction.
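The weak and strong scaling efficiencies quoted above follow standard definitions, sketched below. The abstract does not give the baseline node counts or timings behind the 95.5% and 81.6% figures, so all numbers in this example are hypothetical and the function names are illustrative.

```python
def strong_scaling_efficiency(t_base, n_base, t_scaled, n_scaled):
    """Fixed total problem size: ideal runtime shrinks as n_base / n_scaled.

    Efficiency is the ratio of ideal to measured runtime at n_scaled nodes.
    """
    return (t_base * n_base) / (t_scaled * n_scaled)

def weak_scaling_efficiency(t_base, t_scaled):
    """Fixed problem size per node: ideal runtime stays constant.

    Efficiency is the baseline runtime divided by the scaled runtime.
    """
    return t_base / t_scaled

# Hypothetical timings (seconds per step), not taken from the paper.
print(strong_scaling_efficiency(t_base=100.0, n_base=1, t_scaled=30.0, n_scaled=4))
print(weak_scaling_efficiency(t_base=10.0, t_scaled=12.5))
```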

LG · Oct 3, 2025
Diffusion-Based, Data-Assimilation-Enabled Super-Resolution of Hub-height Winds

Xiaolong Ma, Xu Dong, Ashley Tarrant et al.

High-quality observations of hub-height winds are valuable but sparse in space and time. Simulations are widely available on regular grids but are generally biased and too coarse to inform wind-farm siting or to assess extreme-weather-related risks (e.g., gusts) at infrastructure scales. To fully utilize both data types for generating high-quality, high-resolution hub-height wind speeds (tens to ~100 m above ground), this study introduces WindSR, a diffusion model with data assimilation for super-resolution downscaling of hub-height winds. WindSR integrates sparse observational data with simulation fields during downscaling using state-of-the-art diffusion models. A dynamic-radius blending method is introduced to merge observations with simulations, providing conditioning for the diffusion process. Terrain information is incorporated during both training and inference to account for its role as a key driver of winds. Evaluated against convolutional-neural-network and generative-adversarial-network baselines, WindSR outperforms them in both downscaling efficiency and accuracy. Our data assimilation reduces WindSR's model bias by approximately 20% relative to independent observations.

LG · Jan 18, 2021
Fast and accurate learned multiresolution dynamical downscaling for precipitation

Jiali Wang, Zhengchun Liu, Ian Foster et al.

This study develops a neural network-based approach for emulating high-resolution modeled precipitation data with comparable statistical properties but at greatly reduced computational cost. The key idea is to use a combination of low- and high-resolution simulations to train a neural network to map from the former to the latter. Specifically, we define two types of CNNs, one that stacks variables directly and one that encodes each variable before stacking, and we train each CNN type both with a conventional loss function, such as mean square error (MSE), and with a conditional generative adversarial network (CGAN), for a total of four CNN variants. We compare the four new CNN-derived high-resolution precipitation results with precipitation generated from original high-resolution simulations, a bilinear interpolator, and the state-of-the-art CNN-based super-resolution (SR) technique. Results show that the SR technique produces results similar to those of the bilinear interpolator with smoother spatial and temporal distributions and smaller data variabilities and extremes than the original high-resolution simulations. While the new CNNs trained by MSE generate better results over some regions than the interpolator and SR technique do, their predictions still do not closely match the original high-resolution simulations. The CNNs trained by CGAN generate more realistic and physically reasonable results, better capturing not only data variability in time and space but also extremes such as intense and long-lasting storms. The newly proposed CNN-based downscaling approach can downscale precipitation from 50 km to 12 km in 14 min for 30 years once the network is trained (training takes 4 hours using 1 GPU), while the conventional dynamical downscaling would take 1 month using 600 CPU cores to generate simulations at the resolution of 12 km over the contiguous United States.
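The bilinear interpolator used as a baseline above can be sketched in a few lines of NumPy. This is a generic illustration, not the paper's implementation: it assumes an integer refinement factor on a regular grid (the 50 km to 12 km case is not an integer ratio), and the function name and the toy field are hypothetical.

```python
import numpy as np

def bilinear_upsample(field, factor):
    """Refine a 2-D field by an integer factor using bilinear interpolation.

    Original grid points are preserved at every `factor`-th index of the
    output. Requires at least two points along each axis.
    """
    field = np.asarray(field, dtype=float)
    ny, nx = field.shape
    # Fractional coordinates of the fine grid in coarse-grid index space.
    yi = np.linspace(0, ny - 1, (ny - 1) * factor + 1)
    xi = np.linspace(0, nx - 1, (nx - 1) * factor + 1)
    # Lower-left corner of the coarse cell containing each fine point.
    y0 = np.clip(np.floor(yi).astype(int), 0, ny - 2)
    x0 = np.clip(np.floor(xi).astype(int), 0, nx - 2)
    wy = (yi - y0)[:, None]   # vertical weights, broadcast over columns
    wx = (xi - x0)[None, :]   # horizontal weights, broadcast over rows
    # Interpolate along x on the two bounding rows, then along y.
    top = field[y0][:, x0] * (1 - wx) + field[y0][:, x0 + 1] * wx
    bottom = field[y0 + 1][:, x0] * (1 - wx) + field[y0 + 1][:, x0 + 1] * wx
    return top * (1 - wy) + bottom * wy

coarse = np.array([[0.0, 1.0],
                   [2.0, 3.0]])
fine = bilinear_upsample(coarse, factor=2)
print(fine)
```

Because each output value is a weighted average of its four neighbors, the interpolator cannot create extremes beyond those in the coarse field, which is consistent with the smoothing behavior the abstract reports for this baseline.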