Muralikrishnan Gopalakrishnan Meena

FLU-DYN
h-index 11
4 papers
7 citations
Novelty 61%
AI Score 42

4 Papers

18.1 QUANT-PH Apr 15
Distributed Variational Quantum Linear Solver

Chao Lu, Pooja Rao, Muralikrishnan Gopalakrishnan Meena et al.

The Variational Quantum Linear Solver (VQLS), a hybrid quantum-classical algorithm for solving linear systems, faces a practical scalability bottleneck: the Linear Combination of Unitaries (LCU) decomposition requires O(L^2) circuit evaluations per optimizer iteration, where L can grow as 4^n for n-qubit systems in the worst case. We address this computational bottleneck through two complementary strategies. First, we present a distributed VQLS (D-VQLS) framework, built on NVIDIA CUDA-Q, that enables asynchronous, scalable distribution of the O(L^2) cost-function evaluations. Second, a fast Walsh-Hadamard transform (FWHT)-based Pauli decomposition with 1% coefficient thresholding curbs the exponential growth of LCU terms, reducing L from O(2^n) to O(1) for n > 6 qubits and compressing the per-iteration circuit complexity from O(n * 4^n) to O(n) for sparse, structured matrices. For a 10-qubit tridiagonal Toeplitz system, this yields a 256x reduction, from 23 million to 90,112 circuits per iteration, while preserving over 99.99% solution fidelity. Additionally, to inform feasibility on early fault-tolerant QPUs, the paper provides resource estimates (gate counts, qubit requirements, and circuit evaluations per iteration) for VQLS applied to arbitrary matrices. The D-VQLS framework is validated on the NERSC Perlmutter supercomputer using multi-node, multi-GPU ideal state-vector simulations, achieving over 99.99% fidelity against classical solutions on tridiagonal Toeplitz and Hele-Shaw flow benchmarks, with near-ideal strong scaling up to 24 GPUs and 95.3% weak scaling efficiency at 96 GPUs processing 360,448 circuits per iteration for a 10-qubit system. Systematic profiling identifies the optimal resource allocation for distributed quantum circuit workloads, yielding a 2.52x speedup for the configurations studied.
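The idea of an FWHT-based Pauli decomposition with coefficient thresholding can be illustrated with a small NumPy sketch. This is not the D-VQLS or CUDA-Q implementation. The function names (`fwht`, `pauli_coefficients`, `pauli_matrix`), the convention that qubit j is bit j of the matrix index, and the reading of "1% thresholding" as "drop coefficients below 1% of the largest magnitude" are all assumptions made here for illustration. For each bitmask x, the entries A[r XOR x, r] form a vector whose Walsh-Hadamard transform gives, up to a phase, the coefficients of all Pauli strings with that X/Y pattern.

```python
import numpy as np

def fwht(v):
    """Unnormalized fast Walsh-Hadamard transform: out[z] = sum_r (-1)^(z.r) v[r]."""
    v = np.array(v, dtype=complex)
    h = 1
    while h < len(v):
        for i in range(0, len(v), 2 * h):
            a = v[i:i + h].copy()
            b = v[i + h:i + 2 * h].copy()
            v[i:i + h] = a + b
            v[i + h:i + 2 * h] = a - b
        h *= 2
    return v

def pauli_coefficients(A):
    """Return {(x, z): c} with A = sum c * P(x, z); bit j of x/z marks X/Z on qubit j."""
    dim = A.shape[0]
    rows = np.arange(dim)
    coeffs = {}
    for x in range(dim):
        xor_diagonal = A[rows ^ x, rows]      # entries A[r XOR x, r]
        w = fwht(xor_diagonal) / dim          # one transform covers all z for this x
        for z in range(dim):
            n_y = bin(x & z).count("1")       # qubits carrying a Y = i * X * Z
            coeffs[(x, z)] = (-1j) ** n_y * w[z]
    return coeffs

# Dense single-qubit Paulis, keyed by (x_bit, z_bit); used only to check the result.
_PAULI = {
    (0, 0): np.eye(2, dtype=complex),
    (1, 0): np.array([[0, 1], [1, 0]], dtype=complex),
    (1, 1): np.array([[0, -1j], [1j, 0]], dtype=complex),
    (0, 1): np.array([[1, 0], [0, -1]], dtype=complex),
}

def pauli_matrix(x, z, n):
    """Dense matrix of the Pauli string encoded by bitmasks (x, z) on n qubits."""
    M = np.array([[1.0 + 0j]])
    for j in range(n):                        # qubit 0 ends up least significant
        M = np.kron(_PAULI[((x >> j) & 1, (z >> j) & 1)], M)
    return M

# Illustrative 3-qubit tridiagonal Toeplitz matrix (diagonal 2, off-diagonals -1).
n = 3
dim = 2 ** n
A = 2.0 * np.eye(dim) - np.eye(dim, k=1) - np.eye(dim, k=-1)

coeffs = pauli_coefficients(A)
cutoff = 0.01 * max(abs(c) for c in coeffs.values())   # assumed meaning of "1%"
kept = {key: c for key, c in coeffs.items() if abs(c) >= cutoff}
A_rebuilt = sum(c * pauli_matrix(x, z, n) for (x, z), c in kept.items())

print(f"{len(kept)} of {4 ** n} Pauli terms kept")
```

For a dense matrix this costs one length-2^n transform per value of x, so O(n * 4^n) arithmetic operations in total. In this small example the matrix is rebuilt exactly from the terms that survive the cutoff.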

FLU-DYN Apr 24, 2024
Machine-Learned Closure of URANS for Stably Stratified Turbulence: Connecting Physical Timescales & Data Hyperparameters of Deep Time-Series Models

Muralikrishnan Gopalakrishnan Meena, Demetri Liousas, Andrew D. Simin et al.

We develop time-series machine learning (ML) methods for closure modeling of the unsteady Reynolds-averaged Navier-Stokes (URANS) equations applied to stably stratified turbulence (SST). SST is strongly affected by fine balances between forces and becomes more anisotropic in time for decaying cases. Moreover, there is a limited understanding of the physical phenomena described by some of the terms in the URANS equations. Rather than attempting to model each term separately, it is attractive to explore the capability of machine learning to model groups of terms, i.e., to directly model the force balances. We consider decaying SST flows that are homogeneous and stably stratified by a uniform density gradient, enabling dimensionality reduction. We consider two time-series ML models: Long Short-Term Memory (LSTM) and Neural Ordinary Differential Equation (NODE). Both models perform accurately and are numerically stable in a posteriori tests. Furthermore, we explore the data requirements of the ML models by extracting physically relevant timescales of the complex system. We find that the ratio of the timescales of the minimum information required by the ML models to accurately capture the dynamics of the SST corresponds to the Reynolds number of the flow. The current framework provides the backbone to explore the capability of such models to capture the dynamics of higher-dimensional complex SST flows.

FLU-DYN Jul 22, 2025
Pixel-Resolved Long-Context Learning for Turbulence at Exascale: Resolving Small-scale Eddies Toward the Viscous Limit

Junqi Yin, Mijanur Palash, M. Paul Laiu et al.

Turbulence plays a crucial role in multiphysics applications, including aerodynamics, fusion, and combustion. Accurately capturing turbulence's multiscale characteristics is essential for reliable predictions of multiphysics interactions, but remains a grand challenge even for exascale supercomputers and advanced deep learning models. The extreme-resolution data required to represent turbulence, ranging from billions to trillions of grid points, pose prohibitive computational costs for models based on architectures like vision transformers. To address this challenge, we introduce a multiscale hierarchical Turbulence Transformer that reduces sequence length from billions to a few million and a novel RingX sequence parallelism approach that enables scalable long-context learning. We perform scaling and science runs on the Frontier supercomputer. Our approach demonstrates excellent performance up to 1.1 EFLOPS on 32,768 AMD GPUs, with a scaling efficiency of 94%. To our knowledge, this is the first AI model for turbulence that can capture small-scale eddies down to the dissipative range.

LG Jun 24, 2024
Learning the boundary-to-domain mapping using Lifting Product Fourier Neural Operators for partial differential equations

Aditya Kashi, Arka Daw, Muralikrishnan Gopalakrishnan Meena et al.

Neural operators such as the Fourier Neural Operator (FNO) have been shown to provide resolution-independent deep learning models that can learn mappings between function spaces. For example, an initial condition can be mapped to the solution of a partial differential equation (PDE) at a future time-step using a neural operator. Despite the popularity of neural operators, their use to predict solution functions over a domain given only data over the boundary (such as a spatially varying Dirichlet boundary condition) remains unexplored. In this paper, we refer to such problems as boundary-to-domain problems; they have a wide range of applications in areas such as fluid mechanics, solid mechanics, and heat transfer. We present a novel FNO-based architecture, named Lifting Product FNO (LP-FNO), which can map arbitrary boundary functions defined on the lower-dimensional boundary to a solution in the entire domain. Specifically, two FNOs defined on the lower-dimensional boundary are lifted into the higher-dimensional domain using our proposed lifting product layer. We demonstrate the efficacy and resolution independence of the proposed LP-FNO for the 2D Poisson equation.
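To make the boundary-to-domain problem concrete, the sketch below solves a 2D Poisson problem with Dirichlet data using plain Jacobi iteration on a uniform grid. It is a classical reference solver, not LP-FNO or any part of the paper's architecture. The function name `solve_poisson_dirichlet`, the sign convention (minus Laplacian of u equals the source), and the grid size are assumptions chosen for illustration. A neural operator for this task would learn the map from the boundary values to the interior field that this solver computes numerically.

```python
import numpy as np

def solve_poisson_dirichlet(boundary, source=None, h=1.0, n_iter=3000):
    """Jacobi iteration for -Laplace(u) = source on a square grid with spacing h.

    `boundary` is a 2D array; only its edge values are used (Dirichlet data).
    `source` defaults to zero, which gives the Laplace equation.
    """
    u = np.array(boundary, dtype=float)
    u[1:-1, 1:-1] = 0.0                          # discard any interior guess
    f = np.zeros_like(u) if source is None else np.asarray(source, dtype=float)
    for _ in range(n_iter):
        # The right-hand side is evaluated fully before assignment (true Jacobi).
        u[1:-1, 1:-1] = 0.25 * (
            u[:-2, 1:-1] + u[2:, 1:-1] + u[1:-1, :-2] + u[1:-1, 2:]
            + h * h * f[1:-1, 1:-1]
        )
    return u

# Boundary data taken from the harmonic function g(x, y) = x + y on the unit square.
m = 17
x = np.linspace(0.0, 1.0, m)
X, Y = np.meshgrid(x, x, indexing="ij")
u = solve_poisson_dirichlet(X + Y)

print("max interior error:", np.abs(u - (X + Y)).max())
```

Because x + y is harmonic and the five-point stencil is exact for linear functions, the interior converges to the same function, which gives a simple sanity check for any learned boundary-to-domain model.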