LG Jun 2
Rethinking Neural Width for Alternating Current Optimal Power Flow Proxies
Dhruvi Khandelwal, Anurag Basistha, Ayushi Jolotia et al.
Deep learning proxies for Alternating Current Optimal Power Flow (ACOPF) lack systematic methods for determining architectural size. This paper conducts a constructive thought experiment to answer a fundamental question: how wide must a neural network be to closely approximate the ACOPF manifold? We introduce a Loss-Guided Neural Densification (LG-ND) algorithm that incrementally discovers the necessary capacity, expanding the network only when the current deep neural network topology fails to improve further. Empirical results across various IEEE systems show that LG-ND achieves performance parity with literature baselines using up to ten times fewer neurons per layer. Such architectural minimalism is critical for the formal verification required in safety-critical grid operations.
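The abstract describes LG-ND only at a high level, so the following is a minimal sketch of the general idea of widening a network when its loss plateaus, not the authors' algorithm. The function `loss_guided_growth`, its parameters, and the toy trainer (whose loss floor is assumed to be `1 / width`) are all hypothetical names introduced for illustration.

```python
from typing import Callable, List, Tuple


def loss_guided_growth(
    train_epoch: Callable[[int], float],
    target_loss: float,
    start_width: int = 4,
    grow_by: int = 4,
    max_width: int = 64,
    patience: int = 3,
    min_rel_improvement: float = 1e-2,
    max_epochs: int = 200,
) -> Tuple[int, List[Tuple[int, float]]]:
    """Widen the model only after the loss has stalled for `patience` epochs."""
    width = start_width
    best = float("inf")
    stale = 0
    history: List[Tuple[int, float]] = []
    for _ in range(max_epochs):
        loss = train_epoch(width)  # one epoch at the current width
        history.append((width, loss))
        if loss <= target_loss:
            break  # accurate enough: stop without adding capacity
        if loss < best * (1.0 - min_rel_improvement):
            best, stale = loss, 0  # meaningful progress at this width
        else:
            stale += 1
        if stale >= patience:
            if width + grow_by > max_width:
                break  # capacity budget exhausted
            width += grow_by  # plateau detected: add neurons
            stale = 0
    return width, history


def make_toy_trainer() -> Callable[[int], float]:
    """Hypothetical stand-in for training: the loss decays toward 1 / width."""
    state = {"loss": 1.0}

    def train_epoch(width: int) -> float:
        floor = 1.0 / width  # assumed capacity-limited loss floor
        state["loss"] = floor + 0.5 * (state["loss"] - floor)
        return state["loss"]

    return train_epoch


final_width, history = loss_guided_growth(make_toy_trainer(), target_loss=0.1)
print(final_width, len(history))
```

In this toy setting the width grows only as far as the target loss requires, which is the behaviour the abstract attributes to growing capacity on demand rather than fixing it in advance.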
LG Oct 1, 2023
Data-Efficient Strategies for Probabilistic Voltage Envelopes under Network Contingencies
Parikshit Pareek, Deepjyoti Deka, Sidhant Misra
This work presents an efficient data-driven method to construct probabilistic voltage envelopes (PVE) using power flow learning in grids with network contingencies. First, a network-aware Gaussian process (GP) termed Vertex-Degree Kernel (VDK-GP), developed in prior work, is used to estimate voltage-power functions for a few network configurations. The paper introduces a novel multi-task vertex degree kernel (MT-VDK) that amalgamates the learned VDK-GPs to determine power flows for unseen networks, with a significant reduction in computational complexity and hyperparameter requirements compared to alternative approaches. Simulations on the IEEE 30-Bus network demonstrate the retention and transfer of power flow knowledge in both N-1 and N-2 contingency scenarios. Compared with VDK-GP, the MT-VDK-GP approach achieves over 50% reduction in mean prediction error for novel N-1 contingency network configurations in low training data regimes (50-250 samples). Additionally, MT-VDK-GP outperforms a hyper-parameter based transfer learning approach in over 75% of N-2 contingency network structures, even without historical N-2 outage data. The proposed method constructs PVEs using sixteen times fewer power flow solutions than Monte-Carlo sampling-based methods.
SY Aug 15, 2023
Learning Power Flow with Confidence: A Probabilistic Guarantee Framework for Voltage Risk
Parikshit Pareek, Sidhant Misra, Deepjyoti Deka
The absence of formal performance guarantees in machine learning (ML) has limited its adoption for safety-critical power system applications, where confidence and interpretability are as vital as accuracy. In this work, we present a probabilistic guarantee for power flow learning and voltage risk estimation, derived through the framework of Gaussian Process (GP) regression. Specifically, we establish a bound on the expected estimation error that connects the GP's predictive variance to confidence in voltage risk estimates, ensuring statistical equivalence with Monte Carlo-based ACPF risk quantification. To enhance model learnability in the low-data regime, we first design the Vertex-Degree Kernel (VDK), a topology-aware additive kernel that decomposes voltage-load interactions into local neighborhoods for efficient large-scale learning. Building on this, we introduce a network-swipe active learning (AL) algorithm that adaptively samples informative operating points and provides a principled stopping criterion without requiring out-of-sample validation. Together, these developments mitigate the principal bottleneck of ML-based power flow, its lack of guaranteed reliability, by combining data efficiency with analytical assurance. Empirical evaluations across IEEE 118-, 500-, and 1354-bus systems confirm that the proposed VDK-GP achieves mean absolute voltage errors below 1E-03 p.u., reproduces Monte Carlo-level voltage risk estimates with 15x fewer ACPF computations, and achieves over 120x reduction in evaluation time while conservatively bounding violation probabilities.
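To make the idea of a topology-aware additive kernel concrete, here is a minimal sketch that sums one sub-kernel per bus, each seeing only the loads in that bus's neighborhood. The squared-exponential sub-kernel, the single scalar load per bus, the shared length scale, and the function names are assumptions for illustration; the abstract does not specify the exact form of the VDK.

```python
import math
from typing import Dict, List, Sequence


def neighborhood(bus: int, adjacency: Dict[int, List[int]]) -> List[int]:
    """Indices of a bus and its directly connected neighbors."""
    return sorted({bus, *adjacency[bus]})


def se_kernel(a: Sequence[float], b: Sequence[float], length_scale: float) -> float:
    """Squared-exponential kernel (assumed sub-kernel choice)."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-0.5 * sq_dist / length_scale**2)


def additive_neighborhood_kernel(
    x: Sequence[float],
    x_prime: Sequence[float],
    adjacency: Dict[int, List[int]],
    length_scale: float = 1.0,
) -> float:
    """Sum of per-bus sub-kernels, each restricted to that bus's neighborhood."""
    total = 0.0
    for bus in adjacency:
        idx = neighborhood(bus, adjacency)
        total += se_kernel([x[i] for i in idx], [x_prime[i] for i in idx], length_scale)
    return total


# Hypothetical 4-bus chain: 0 - 1 - 2 - 3, one scalar load per bus.
adjacency = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
x_a = [0.10, 0.20, 0.30, 0.40]
x_b = [0.10, 0.20, 0.30, 0.90]  # differs from x_a only at bus 3

k_ab = additive_neighborhood_kernel(x_a, x_b, adjacency)
k_ba = additive_neighborhood_kernel(x_b, x_a, adjacency)
k_aa = additive_neighborhood_kernel(x_a, x_a, adjacency)
print(k_ab, k_aa)
```

Because the two load vectors differ only at bus 3, only the sub-kernels whose neighborhoods contain bus 3 change; the others still see identical inputs. Each sub-kernel also has a low-dimensional input, which is one plausible reading of why such a decomposition helps in large networks.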
LG Jan 29
Amortized Spectral Kernel Discovery via Prior-Data Fitted Network
Kaustubh Sharma, Srijan Tiwari, Ojasva Nema et al.
Prior-Data Fitted Networks (PFNs) enable efficient amortized inference but lack transparent access to their learned priors and kernels. This opacity hinders their use in downstream tasks, such as surrogate-based optimization, that require explicit covariance models. We introduce an interpretability-driven framework for amortized spectral discovery from pre-trained PFNs with decoupled attention. We perform a mechanistic analysis on a trained PFN that identifies attention latent output as the key intermediary, linking observed function data to spectral structure. Building on this insight, we propose decoder architectures that map PFN latents to explicit spectral density estimates and corresponding stationary kernels via Bochner's theorem. We study this pipeline in both single-realization and multi-realization regimes, contextualizing theoretical limits on spectral identifiability and proving consistency when multiple function samples are available. Empirically, the proposed decoders recover complex multi-peak spectral mixtures and produce explicit kernels that support Gaussian process regression with accuracy comparable to PFNs and optimization-based baselines, while requiring only a single forward pass. This yields orders-of-magnitude reductions in inference time compared to optimization-based baselines.
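Bochner's theorem links a stationary kernel to a spectral density through a Fourier transform. As a worked illustration (not the paper's decoder), the sketch below assumes a one-dimensional symmetric Gaussian-mixture spectral density, for which the kernel has the well-known spectral mixture closed form $k(\tau) = \sum_q w_q \exp(-2\pi^2 \sigma_q^2 \tau^2) \cos(2\pi \mu_q \tau)$, and checks it against direct numerical integration. The weights, means, and standard deviations are made-up values.

```python
import math
from typing import Sequence


def spectral_mixture_kernel(tau: float, weights: Sequence[float],
                            means: Sequence[float], stds: Sequence[float]) -> float:
    """Closed-form kernel for a symmetric Gaussian-mixture spectral density."""
    return sum(
        w * math.exp(-2.0 * math.pi**2 * s**2 * tau**2) * math.cos(2.0 * math.pi * m * tau)
        for w, m, s in zip(weights, means, stds)
    )


def gaussian_pdf(x: float, mean: float, std: float) -> float:
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def spectral_density(freq: float, weights: Sequence[float],
                     means: Sequence[float], stds: Sequence[float]) -> float:
    """Mixture symmetrized about zero so that the kernel is real-valued."""
    return sum(
        0.5 * w * (gaussian_pdf(freq, m, s) + gaussian_pdf(freq, -m, s))
        for w, m, s in zip(weights, means, stds)
    )


def kernel_by_quadrature(tau: float, weights: Sequence[float], means: Sequence[float],
                         stds: Sequence[float], n_steps: int = 20000) -> float:
    """Trapezoidal estimate of the integral of S(f) * cos(2 * pi * f * tau) over f."""
    limit = max(means) + 8.0 * max(stds)  # density is negligible beyond this
    h = 2.0 * limit / n_steps
    total = 0.0
    for i in range(n_steps + 1):
        f = -limit + i * h
        end_weight = 0.5 if i in (0, n_steps) else 1.0
        total += end_weight * spectral_density(f, weights, means, stds) * math.cos(
            2.0 * math.pi * f * tau
        )
    return total * h


# Hypothetical two-peak spectrum.
weights, means, stds = [0.6, 0.4], [0.5, 2.0], [0.1, 0.3]
tau = 0.7
closed_form = spectral_mixture_kernel(tau, weights, means, stds)
numeric = kernel_by_quadrature(tau, weights, means, stds)
print(closed_form, numeric)
```

A decoder that outputs mixture parameters therefore yields an explicit kernel with no further optimization, which is the kind of object a downstream Gaussian process needs.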
LG Feb 5
Structural Disentanglement in Bilinear MLPs via Architectural Inductive Bias
Ojasva Nema, Kaustubh Sharma, Aditya Chauhan et al.
Selective unlearning and long-horizon extrapolation remain fragile in modern neural networks, even when tasks have underlying algebraic structure. In this work, we argue that these failures arise not solely from optimization or unlearning algorithms, but from how models structure their internal representations during training. Using Bilinear MLPs, we explore whether explicit multiplicative interactions, imposed as an architectural inductive bias, help structural disentanglement. We show analytically that bilinear parameterizations possess a "non-mixing" property under gradient flow conditions, where functional components separate into orthogonal subspace representations. This provides a mathematical foundation for surgical model modification. We validate this hypothesis through a series of controlled experiments spanning modular arithmetic, cyclic reasoning, Lie group dynamics, and targeted unlearning benchmarks. Unlike pointwise nonlinear networks, multiplicative architectures are able to recover true operators aligned with the underlying algebraic structure. Our results suggest that model editability and generalization are constrained by representational structure, and that architectural inductive bias plays a central role in enabling reliable unlearning.
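For readers unfamiliar with bilinear layers, the sketch below shows one common form, in which the hidden activation is the elementwise product of two linear maps of the same input, with no pointwise nonlinearity. Whether the paper uses exactly this parameterization is an assumption; the matrices `W`, `V`, `P` and the helper names are illustrative only. The second half shows that such a layer is equivalent to one quadratic form per output, which is what makes its interactions explicit.

```python
from typing import List

Matrix = List[List[float]]
Vector = List[float]


def matvec(m: Matrix, v: Vector) -> Vector:
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def bilinear_mlp(x: Vector, W: Matrix, V: Matrix, P: Matrix) -> Vector:
    """y = P ((W x) * (V x)), where * is the elementwise product."""
    hidden = [a * b for a, b in zip(matvec(W, x), matvec(V, x))]
    return matvec(P, hidden)


def interaction_tensor(W: Matrix, V: Matrix, P: Matrix) -> List[Matrix]:
    """B[k][i][j] = sum_h P[k][h] * W[h][i] * V[h][j], so that y_k = x^T B[k] x."""
    n_in = len(W[0])
    return [
        [
            [sum(P_k[h] * W[h][i] * V[h][j] for h in range(len(W))) for j in range(n_in)]
            for i in range(n_in)
        ]
        for P_k in P
    ]


def quadratic_forms(B: List[Matrix], x: Vector) -> Vector:
    n = len(x)
    return [sum(x[i] * B_k[i][j] * x[j] for i in range(n) for j in range(n)) for B_k in B]


# Hypothetical weights: 2 inputs, 3 hidden units, 1 output.
W = [[1.0, 2.0], [0.0, -1.0], [3.0, 1.0]]
V = [[0.5, 0.0], [1.0, 1.0], [-1.0, 2.0]]
P = [[1.0, -1.0, 0.5]]
x = [2.0, -1.0]

y_layer = bilinear_mlp(x, W, V, P)
y_quadratic = quadratic_forms(interaction_tensor(W, V, P), x)
print(y_layer, y_quadratic)
```

Because the whole layer collapses into the tensor `B`, its input-to-output interactions can be inspected or edited directly, which is one intuition for why multiplicative architectures might support targeted modification.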
LG Sep 25, 2025
Decoupled-Value Attention for Prior-Data Fitted Networks: GP Inference for Physical Equations
Kaustubh Sharma, Simardeep Singh, Parikshit Pareek
Prior-data fitted networks (PFNs) are a promising alternative to time-consuming Gaussian Process (GP) inference for creating fast surrogates of physical systems. PFNs reduce the computational burden of GP training by replacing Bayesian inference in the GP with a single forward pass of a learned prediction model. However, with standard Transformer attention, PFNs show limited effectiveness on high-dimensional regression tasks. We introduce Decoupled-Value Attention (DVA), motivated by the GP property that the function space is fully characterized by the kernel over inputs and the predictive mean is a weighted sum of training targets. DVA computes similarities from inputs only and propagates labels solely through values. Thus, the proposed DVA mirrors the Gaussian-process update while remaining kernel-free. We demonstrate that the crucial factor for scaling PFNs is the attention rule rather than the architecture itself. Specifically, our results demonstrate that (a) localized attention consistently reduces out-of-sample validation loss in PFNs across different dimensional settings, with validation loss reduced by more than 50% in five- and ten-dimensional cases, and (b) the role of attention is more decisive than the choice of backbone architecture, showing that CNN-based PFNs can perform on par with their Transformer-based counterparts. The proposed PFNs provide 64-dimensional power flow equation approximations with a mean absolute error of the order of 1E-3, while being over 80x faster than exact GP inference.
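The decoupling described above can be illustrated with a single attention head in which queries and keys depend only on inputs and values depend only on labels. This is a minimal sketch under simplifying assumptions: identity projections stand in for the learned query, key, and value maps, labels are scalars, and the function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def softmax(scores: Sequence[float]) -> List[float]:
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def decoupled_value_attention(
    x_train: Sequence[Sequence[float]],
    y_train: Sequence[float],
    x_query: Sequence[float],
) -> Tuple[float, List[float]]:
    """Attention weights come from inputs only; labels enter only as values."""
    dim = len(x_query)
    scores = [
        sum(q * k for q, k in zip(x_query, x_i)) / math.sqrt(dim)  # scaled dot product
        for x_i in x_train
    ]
    weights = softmax(scores)
    prediction = sum(w * y for w, y in zip(weights, y_train))
    return prediction, weights


# Hypothetical context set of three points in two dimensions.
x_train = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
y_train = [0.5, -1.0, 2.0]
x_query = [0.9, 0.8]

prediction, weights = decoupled_value_attention(x_train, y_train, x_query)
print(prediction, weights)
```

As with the GP predictive mean, the output is a weighted sum of training targets whose weights do not depend on those targets, so rescaling the labels rescales the prediction by the same factor.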
LG Sep 19, 2025
Small LLMs with Expert Blocks Are Good Enough for Hyperparameter Tuning
Om Naphade, Saksham Bansal, Parikshit Pareek
Hyper-parameter Tuning (HPT) is a necessary step in machine learning (ML) pipelines but becomes computationally expensive and opaque with larger models. Recently, Large Language Models (LLMs) have been explored for HPT, yet most rely on models exceeding 100 billion parameters. We propose an Expert Block Framework for HPT using Small LLMs. At its core is the Trajectory Context Summarizer (TCS), a deterministic block that transforms raw training trajectories into structured context, enabling small LLMs to analyze optimization progress with reliability comparable to larger models. Using two locally-run LLMs (phi4:reasoning14B and qwen2.5-coder:32B) and a 10-trial budget, our TCS-enabled HPT pipeline achieves average performance within ~0.9 percentage points of GPT-4 across six diverse tasks.
SY Apr 30, 2025
Power Flow Approximations for Multiphase Distribution Networks using Gaussian Processes
Daniel Glover, Parikshit Pareek, Deepjyoti Deka et al.
Learning-based approaches are increasingly leveraged to manage and coordinate the operation of grid-edge resources in active power distribution networks. Among these, model-based techniques stand out for their superior data efficiency and robustness compared to model-free methods. However, effective model learning requires a learning-based approximator for the underlying power flow model. This study extends existing work by introducing a data-driven power flow method based on Gaussian Processes (GPs) to approximate the multiphase power flow model, by mapping net load injections to nodal voltages. Simulation results using the IEEE 123-bus and 8500-node distribution test feeders demonstrate that the trained GP model can reliably predict the nonlinear power flow solutions with minimal training data. We also conduct a comparative analysis of the training efficiency and testing performance of the proposed GP-based power flow approximator against a deep neural network-based approximator, highlighting the advantages of our data-efficient approach. Results over realistic operating conditions show that despite an 85% reduction in the training sample size (corresponding to a 92.8% improvement in training time), GP models produce a 99.9% relative reduction in mean absolute error compared to the baselines of deep neural networks.
SY Apr 16, 2020
Gaussian Process Learning-based Probabilistic Optimal Power Flow
Parikshit Pareek, Hung D. Nguyen
In this letter, we present a novel Gaussian Process Learning-based Probabilistic Optimal Power Flow (GP-POPF) for solving POPF under renewable and load uncertainties of arbitrary distribution. The proposed method relies on a non-parametric Bayesian inference-based uncertainty propagation approach, called Gaussian Process (GP). We also suggest a new type of sensitivity called Subspace-wise Sensitivity, using observations on the interpretability of GP-POPF hyperparameters. The simulation results on 14-bus and 30-bus systems show that the proposed method provides reasonably accurate solutions when compared with Monte-Carlo Simulations (MCS) solutions at different levels of uncertain renewable penetration as well as load uncertainties, while requiring far fewer samples and much less elapsed time.
SY Nov 8, 2019
Non-parametric Probabilistic Load Flow using Gaussian Process Learning
Parikshit Pareek, Chuan Wang, Hung D. Nguyen
In this work, we propose a non-parametric probabilistic load flow (NP-PLF) technique based on Gaussian Process (GP) learning to understand power system behavior under uncertainty for better operational decisions. The technique can provide "semi-explicit" power flow solutions by implementing the learning and testing steps which map control variables to inputs. The proposed NP-PLF leverages the GP upper confidence bound (GP-UCB) sampling algorithm. The salient features of this NP-PLF method are: i) applicable to power flow problems with power injection uncertainty from an unknown class of distribution; ii) providing a probabilistic learning bound (PLB), which further provides control over the error and convergence; iii) capable of handling intermittent distributed generation as well as load uncertainties; and iv) applicable to both balanced and unbalanced power flow in power systems of different types and sizes. Simulation results on the IEEE 30-bus and IEEE 118-bus systems show that the proposed method can learn the voltage function over the power injection subspace using a small number of training samples. Further, testing with different input uncertainty distributions indicates that complete statistical information can be obtained for the probabilistic load flow problem with an average percentage relative error of order $10^{-3}$\% on 50000 test points.