Aditya Kashi

LG
h-index10
5papers
9citations
Novelty45%
AI Score38

5 Papers

59.1DCApr 7
Fine-Grained Power and Energy Attribution on AMD GPU/APU-Based Exascale Nodes

Adam McDaniel, Michael Jantz, Ashesh Sharma et al.

Modern exascale GPU- and APU-based systems provide multiple power and energy sensors, but differences in scope, update rate, timing, and filtering complicate the attribution of short-lived accelerator activity. This paper presents a methodology to characterize and correct these effects on Cray EX systems with AMD Instinct MI250X GPUs (Frontier) and MI300A APUs (Portage). Using controlled square-wave workloads, we quantify update intervals, delay, aliasing, and variability across up to 512 GPUs and 480 APUs with on-chip (rocm-smi/amd-smi) and off-chip Cray Power Management sensors. We reconstruct power from cumulative energy counters to achieve faster response times, validate it against on-chip, off-chip, and node-level sensors, and integrate the resulting streams into a Score-P/PAPI-based tool for time-aligned, phase-level attribution. Applied to rocHPL, rocHPL-MxP, and HPG-MxP, the method separates energy savings due to reduced runtime from changes in power. Mixed precision reduces node energy on Frontier by 79% for rocHPL-MxP and 31% for HPG-MxP, with similar trends on Portage. These results provide portable guidance for sensor validation and power-aware optimization on current and future exascale systems.

FLU-DYNApr 24, 2024
Machine-Learned Closure of URANS for Stably Stratified Turbulence: Connecting Physical Timescales & Data Hyperparameters of Deep Time-Series Models

Muralikrishnan Gopalakrishnan Meena, Demetri Liousas, Andrew D. Simin et al.

We develop time-series machine learning (ML) methods for closure modeling of the Unsteady Reynolds Averaged Navier Stokes (URANS) equations applied to stably stratified turbulence (SST). SST is strongly affected by fine balances between forces and becomes more anisotropic in time for decaying cases. Moreover, there is a limited understanding of the physical phenomena described by some of the terms in the URANS equations. Rather than attempting to model each term separately, it is attractive to explore the capability of machine learning to model groups of terms, i.e., to directly model the force balances. We consider decaying SST which are homogeneous and stably stratified by a uniform density gradient, enabling dimensionality reduction. We consider two time-series ML models: Long Short-Term Memory (LSTM) and Neural Ordinary Differential Equation (NODE). Both models perform accurately and are numerically stable in a posteriori tests. Furthermore, we explore the data requirements of the ML models by extracting physically relevant timescales of the complex system. We find that the ratio of the timescales of the minimum information required by the ML models to accurately capture the dynamics of the SST corresponds to the Reynolds number of the flow. The current framework provides the backbone to explore the capability of such models to capture the dynamics of higher-dimensional complex SST flows.

LGAug 5, 2025
Intelligent Sampling of Extreme-Scale Turbulence Datasets for Accurate and Efficient Spatiotemporal Model Training

Wesley Brewer, Murali Meena Gopalakrishnan, Matthias Maiterth et al.

With the end of Moore's law and Dennard scaling, efficient training increasingly requires rethinking data volume. Can we train better models with significantly less data via intelligent subsampling? To explore this, we develop SICKLE, a sparse intelligent curation framework for efficient learning, featuring a novel maximum entropy (MaxEnt) sampling approach, scalable training, and energy benchmarking. We compare MaxEnt with random and phase-space sampling on large direct numerical simulation (DNS) datasets of turbulence. Evaluating SICKLE at scale on Frontier, we show that subsampling as a preprocessing step can, in many cases, improve model accuracy and substantially lower energy consumption, with observed reductions of up to 38x.

LGJun 24, 2024
Scalable Artificial Intelligence for Science: Perspectives, Methods and Exemplars

Wesley Brewer, Aditya Kashi, Sajal Dash et al.

In a post-ChatGPT world, this paper explores the potential of leveraging scalable artificial intelligence for scientific discovery. We propose that scaling up artificial intelligence on high-performance computing platforms is essential to address such complex problems. This perspective focuses on scientific use cases like cognitive simulations, large language models for scientific inquiry, medical image analysis, and physics-informed approaches. The study outlines the methodologies needed to address such challenges at scale on supercomputers or the cloud and provides exemplars of such approaches applied to solve a variety of scientific problems.

LGJun 24, 2024
Learning the boundary-to-domain mapping using Lifting Product Fourier Neural Operators for partial differential equations

Aditya Kashi, Arka Daw, Muralikrishnan Gopalakrishnan Meena et al.

Neural operators such as the Fourier Neural Operator (FNO) have been shown to provide resolution-independent deep learning models that can learn mappings between function spaces. For example, an initial condition can be mapped to the solution of a partial differential equation (PDE) at a future time-step using a neural operator. Despite the popularity of neural operators, their use to predict solution functions over a domain given only data over the boundary (such as a spatially varying Dirichlet boundary condition) remains unexplored. In this paper, we refer to such problems as boundary-to-domain problems; they have a wide range of applications in areas such as fluid mechanics, solid mechanics, heat transfer etc. We present a novel FNO-based architecture, named Lifting Product FNO (or LP-FNO) which can map arbitrary boundary functions defined on the lower-dimensional boundary to a solution in the entire domain. Specifically, two FNOs defined on the lower-dimensional boundary are lifted into the higher dimensional domain using our proposed lifting product layer. We demonstrate the efficacy and resolution independence of the proposed LP-FNO for the 2D Poisson equation.