LG · Sep 15, 2025
Nonlocal Neural Tangent Kernels via Parameter-Space Interactions
Sriram Nagaraj, Vishakh Hari
The Neural Tangent Kernel (NTK) framework has provided deep insights into the training dynamics of neural networks under gradient flow. However, it relies on the assumption that the network is differentiable with respect to its parameters, an assumption that breaks down when considering non-smooth target functions or parameterized models exhibiting non-differentiable behavior. In this work, we propose a Nonlocal Neural Tangent Kernel (NNTK) that replaces the local gradient with a nonlocal interaction-based approximation in parameter space. Nonlocal gradients are known to exist for a wider class of functions than the standard gradient. This allows NTK theory to be extended to nonsmooth functions, stochastic estimators, and broader families of models. We explore both fixed-kernel and attention-based formulations of this nonlocal operator. We illustrate the new formulation with numerical studies.
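As a rough illustration of the fixed-kernel idea, the sketch below replaces the parameter gradient with a Monte Carlo nonlocal gradient and forms a kernel from inner products of those vectors. The abstract does not give the operator's exact form, so everything here is an assumption for illustration: the uniform-in-radius interaction kernel of horizon `delta`, the toy ReLU `model`, and the helper names `sample_offsets`, `nonlocal_grad`, and `nonlocal_ntk` are all hypothetical.

```python
import numpy as np


def model(theta, x):
    """Toy two-unit ReLU model (hypothetical); nonsmooth in theta at the kinks."""
    w, v = theta[:2], theta[2:]
    return float(v @ np.maximum(w * x, 0.0))


def sample_offsets(dim, n, delta, rng):
    """Draw n parameter-space offsets z with uniform direction and radius in (0, delta]."""
    u = rng.standard_normal((n, dim))
    u /= np.linalg.norm(u, axis=1, keepdims=True)
    r = delta * (1.0 - rng.uniform(size=(n, 1)))  # strictly positive radii
    return r * u


def nonlocal_grad(f, theta, offsets):
    """Monte Carlo estimate of d * E[(f(theta + z) - f(theta)) * z / |z|^2].

    For a radial offset distribution and smooth f this approaches the usual
    gradient as the horizon shrinks; it stays defined where f has kinks.
    """
    d = theta.size
    f0 = f(theta)
    diffs = np.array([f(theta + z) - f0 for z in offsets])
    sq_norms = np.sum(offsets**2, axis=1)
    return d * np.mean((diffs / sq_norms)[:, None] * offsets, axis=0)


def nonlocal_ntk(xs, theta, offsets):
    """Assumed kernel: K[i, j] = <G f(x_i; theta), G f(x_j; theta)>."""
    grads = np.stack(
        [nonlocal_grad(lambda t: model(t, x), theta, offsets) for x in xs]
    )
    return grads @ grads.T


rng = np.random.default_rng(0)
theta = np.array([0.0, 1.0, 0.5, -0.3])  # first weight sits exactly on a ReLU kink
offsets = sample_offsets(theta.size, 2000, 0.1, rng)
K = nonlocal_ntk([-1.0, 0.5, 2.0], theta, offsets)
print(K)
```

Reusing the same offsets for every input keeps the kernel matrix an exact Gram matrix, so it is symmetric and positive semidefinite by construction. An attention-based variant would presumably replace the fixed sampling distribution with learned weights, which this sketch does not attempt.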
NA · Jun 21, 2024
Marrying Compressed Sensing and Deep Signal Separation
Truman Hickok, Sriram Nagaraj
Blind signal separation (BSS) is an important and challenging signal processing task. Given an observed signal that is a superposition of a collection of unknown (hidden/latent) signals, BSS aims to recover the separate, underlying signals from only the observed mixed signal. As an underdetermined problem, BSS is notoriously difficult to solve in general, and modern deep learning has provided engineers with an effective set of tools to solve this problem. For example, autoencoders learn a low-dimensional hidden encoding of the input data which can then be used to perform signal separation. In real-time systems, a common bottleneck is the transmission of data (communications) to a central command in order to await decisions. Bandwidth limits dictate the frequency and resolution of the data being transmitted. To overcome this, compressed sensing (CS) technology allows for the direct acquisition of compressed data with a near-optimal reconstruction guarantee. This paper addresses the question: can compressive acquisition be combined with deep learning for BSS to provide a complete acquire-separate-predict pipeline? In other words, the aim is to perform BSS on a compressively acquired signal directly, without ever having to decompress the signal. We consider image data (MNIST and E-MNIST) and show how our compressive autoencoder approach solves the problem of compressive BSS. We also provide some theoretical insights into the problem.
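To make the acquisition step concrete, here is a minimal sketch of compressively measuring a mixed signal. It assumes a random Gaussian sensing matrix, a 4x compression ratio, and random vectors standing in for flattened 28x28 images; none of these choices come from the paper, and the helper names `gaussian_sensing_matrix` and `compress_mixture` are hypothetical. The autoencoder that would separate the sources from `y` is not shown.

```python
import numpy as np


def gaussian_sensing_matrix(m, n, rng):
    """Random m x n Gaussian matrix, scaled so that ||Phi x||^2 is close to ||x||^2 on average."""
    return rng.standard_normal((m, n)) / np.sqrt(m)


def compress_mixture(phi, sources):
    """Measure the superposition of the sources: y = Phi (x_1 + ... + x_k)."""
    mixed = np.sum(sources, axis=0)
    return phi @ mixed


rng = np.random.default_rng(0)
n, m = 784, 196                      # assumed: 28x28 images, 4x compression
x1, x2 = rng.random(n), rng.random(n)  # stand-ins for two image sources
phi = gaussian_sensing_matrix(m, n, rng)
y = compress_mixture(phi, np.stack([x1, x2]))
print(y.shape)
```

Because the measurement is linear, measuring the mixture is the same as mixing the measurements, so a separation model can in principle be trained on `y` directly, with no decompression step in between.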
LG · Jun 21, 2024
Physics Informed Machine Learning (PIML) methods for estimating the remaining useful lifetime (RUL) of aircraft engines
Sriram Nagaraj, Truman Hickok
This paper uses the newly developing field of physics informed machine learning (PIML) to develop models for predicting the remaining useful lifetime (RUL) of aircraft engines. We consider the well-known benchmark NASA Commercial Modular Aero-Propulsion System Simulation (C-MAPSS) data as the main data for this paper, which consists of sensor outputs in a variety of different operating modes. C-MAPSS is a well-studied dataset with much existing work in the literature that addresses RUL prediction with classical and deep learning methods. In the absence of published empirical physical laws governing the C-MAPSS data, our approach first uses stochastic methods to estimate the governing physics models from the noisy time series data. We model the various sensor readings as being governed by stochastic differential equations, and we estimate the corresponding transition density mean and variance functions of the underlying processes. We then augment LSTM (long short-term memory) models with the learned mean and variance functions during training and inference. Our PIML-based approach differs from previous methods in that we use the data to first learn the physics. Our results indicate that PIML discovery and solution methods are well suited for this problem and outperform previous data-only deep learning methods for this data set and task. Moreover, the framework developed herein is flexible, and can be adapted to other situations (other sensor modalities or combined multi-physics environments), including cases where the underlying physics is only partially observed or known.
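One simple way to estimate mean and variance functions of an SDE from a time series is to average increments conditionally on the current state. The sketch below does this with quantile bins on a synthetic Ornstein-Uhlenbeck path; it is a swapped-in textbook estimator and not necessarily the paper's procedure. The function names, the bin count, and the synthetic data (used here in place of C-MAPSS sensor readings) are all assumptions.

```python
import numpy as np


def simulate_ou(theta, sigma, dt, n_steps, rng, x0=0.0):
    """Euler-Maruyama path of dX = -theta * X dt + sigma dW (synthetic stand-in data)."""
    x = np.empty(n_steps + 1)
    x[0] = x0
    noise = sigma * np.sqrt(dt) * rng.standard_normal(n_steps)
    for k in range(n_steps):
        x[k + 1] = x[k] - theta * x[k] * dt + noise[k]
    return x


def estimate_mean_variance(x, dt, n_bins=10):
    """Binned conditional moments of the increments of a scalar series.

    Returns bin centers, the estimated drift mu(x) ~ E[dX | X = x] / dt, and
    the estimated squared diffusion sigma^2(x) ~ Var[dX | X = x] / dt.
    """
    state, dx = x[:-1], np.diff(x)
    # Quantile edges give every bin roughly the same number of samples.
    edges = np.quantile(state, np.linspace(0.0, 1.0, n_bins + 1))
    idx = np.clip(np.searchsorted(edges, state, side="right") - 1, 0, n_bins - 1)
    centers = np.array([state[idx == b].mean() for b in range(n_bins)])
    drift = np.array([dx[idx == b].mean() / dt for b in range(n_bins)])
    diffusion_sq = np.array([dx[idx == b].var() / dt for b in range(n_bins)])
    return centers, drift, diffusion_sq
```

On a path simulated with `simulate_ou(1.0, 0.5, 0.01, 200_000, rng)`, the estimated drift should fall roughly on the line `-x` and the squared diffusion should sit near `0.25`. The learned functions could then be supplied to a sequence model as extra features, which is one plausible reading of "augmenting" the LSTM.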
LG · Jun 21, 2024
BrowNNe: Brownian Nonlocal Neurons & Activation Functions
Sriram Nagaraj, Truman Hickok
It is generally thought that the use of stochastic activation functions in deep learning architectures yields models with superior generalization abilities. However, a sufficiently rigorous statement and theoretical proof of this heuristic is lacking in the literature. In this paper, we provide several novel contributions to the literature in this regard. First, defining a new notion of nonlocal directional derivative, we analyze its theoretical properties (existence and convergence). Second, using a probabilistic reformulation, we show that nonlocal derivatives are epsilon-subgradients, and derive sample complexity results for convergence of stochastic gradient descent-like methods using nonlocal derivatives. Finally, using our analysis of the nonlocal gradient of Hölder continuous functions, we observe that sample paths of Brownian motion admit nonlocal directional derivatives, and the nonlocal derivatives of Brownian motion are seen to be Gaussian processes with computable mean and standard deviation. Using the theory of nonlocal directional derivatives, we solve a highly nondifferentiable and nonconvex model problem of parameter estimation on image articulation manifolds. Using Brownian motion infused ReLU activation functions with the nonlocal gradient in place of the usual gradient during backpropagation, we also perform experiments on multiple well-studied deep learning architectures. Our experiments indicate the superior generalization capabilities of Brownian neural activation functions in low-training data regimes, where the use of stochastic neurons beats the deterministic ReLU counterpart.
NA · Jul 22, 2017
A spacetime DPG method for the Schrödinger equation
Leszek Demkowicz, Jay Gopalakrishnan, Sriram Nagaraj et al.
A spacetime Discontinuous Petrov-Galerkin (DPG) method for the linear time-dependent Schrödinger equation is proposed. The spacetime approach is particularly attractive for capturing irregular solutions. Motivated by the fact that some irregular Schrödinger solutions cannot be solutions of certain first order reformulations, the proposed spacetime method uses the second order Schrödinger operator. Two variational formulations are proved to be well-posed: a strong formulation (with no relaxation of the original equation) and a weak formulation (also called the ultraweak formulation, that transfers all derivatives onto test functions). The convergence of the DPG method based on the ultraweak formulation is investigated using an interpolation operator. A standalone appendix analyzes the ultraweak formulation for general differential operators. Reports of numerical experiments motivated by pulse propagation in dispersive optical fibers are also included.
NA · May 19, 2015
Orientation Embedded High Order Shape Functions for the Exact Sequence Elements of All Shapes
Federico Fuentes, Brendan Keith, Leszek Demkowicz et al.
A unified construction of high order shape functions is given for all four classical energy spaces ($H^1$, $H(\mathrm{curl})$, $H(\mathrm{div})$ and $L^2$) and for elements of "all" shapes (segment, quadrilateral, triangle, hexahedron, tetrahedron, triangular prism and pyramid). The discrete spaces spanned by the shape functions satisfy the commuting exact sequence property for each element. The shape functions are conforming, hierarchical and compatible with other neighboring elements across shared boundaries so they may be used in hybrid meshes. Expressions for the shape functions are given in coordinate free format in terms of the relevant affine coordinates of each element shape. The polynomial order is allowed to differ for each separate topological entity (vertex, edge, face or interior) in the mesh, so the shape functions can be used to implement local $p$ adaptive finite element methods. Each topological entity may have its own orientation, and the shape functions can have that orientation embedded by a simple permutation of arguments.