OC Apr 29
Man, Machine, and Mathematics
Akshunna S. Dogra
Nonlinear models and optimization methods have successfully tackled a rapidly growing set of problems in recent years. Indeed, a relatively small toolbox of such models and methods can provide sufficient performance across a large landscape of tasks: deep learning alone has made significant recent contributions in scientific modelling, natural language processing, visual analysis, etc. A similar relationship exists between physical theories and phenomena, where many applications and observations emerge neatly from remarkably minimal foundations. It is natural to wonder if sparse unified frameworks could be built to steer discussion and discovery in the fields concerned with learning, optimization, and modelling. In this work, we posit and examine a possible outline for such a unified theory, interpreting the notion of "learning" in a broad sense. In particular, we pursue our goals by viewing learning as an inter-connected process on multiple levels: problem setup, choosing methods, and the analysis of their interplay via imposed optimization dynamics. We begin by proposing a precise yet versatile definition for "solvable" problems. We then define the "parametrised methods" by which their solution(s) may be "learned". Our goal is to sketch a "universal convergence theorem", specifying how and when solvable problems become amenable to the methods chosen for them. We find these constructions reduce the study of learning down to remarkably few ideas and tools, many of which are simply adapted from existing ones in dynamical systems theory, geometry, and fundamental physics.
LG Oct 22, 2025
FINDER: Feature Inference on Noisy Datasets using Eigenspace Residuals
Trajan Murphy, Akshunna S. Dogra, Hanfeng Gu et al.
"Noisy" datasets (regimes with low signal-to-noise ratios, small sample sizes, faulty data collection, etc.) remain a key research frontier for classification methods with both theoretical and practical implications. We introduce FINDER, a rigorous framework for analyzing generic classification problems, with tailored algorithms for noisy datasets. FINDER incorporates fundamental stochastic analysis ideas into the feature learning and inference stages to optimally account for the randomness inherent to all empirical datasets. We construct "stochastic features" by first viewing empirical datasets as realizations from an underlying random field (without assumptions on its exact distribution) and then mapping them to appropriate Hilbert spaces. The Kosambi-Karhunen-Loève expansion (KLE) breaks these stochastic features into computable irreducible components, which allow classification over noisy datasets via an eigen-decomposition: data from different classes resides in distinct regions, identified by analyzing the spectrum of the associated operators. We validate FINDER on several challenging, data-deficient scientific domains, producing state-of-the-art breakthroughs in (i) Alzheimer's Disease stage classification and (ii) remote sensing detection of deforestation. We end with a discussion on when FINDER is expected to outperform existing methods, its failure modes, and other limitations.
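On finite samples, a truncated KLE reduces to an eigendecomposition of the empirical covariance, so the idea of separating classes by eigenspace residuals can be illustrated with a small sketch. The code below is not the FINDER algorithm: the synthetic data, the helper names (`fit_eigenspace`, `residual`, `sample_class`), the subspace dimension, and the nearest-eigenspace decision rule are all assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def fit_eigenspace(samples, k):
    """Return the sample mean and the top-k covariance eigenvectors."""
    mean = samples.mean(axis=0)
    cov = np.cov(samples - mean, rowvar=False)
    _, eigvecs = np.linalg.eigh(cov)  # eigenvalues come back in ascending order
    return mean, eigvecs[:, -k:]

def residual(x, mean, basis):
    """Norm of the part of (x - mean) lying outside the span of `basis`."""
    centered = x - mean
    projected = centered @ basis @ basis.T
    return np.linalg.norm(centered - projected, axis=-1)

def sample_class(basis, n, noise=0.05):
    """Hypothetical data: points near a low-dimensional subspace, plus noise."""
    coeffs = rng.normal(size=(n, basis.shape[1]))
    return coeffs @ basis.T + noise * rng.normal(size=(n, basis.shape[0]))

# Two synthetic classes, each concentrated near its own 2D subspace of R^10.
basis_a = np.linalg.qr(rng.normal(size=(10, 2)))[0]
basis_b = np.linalg.qr(rng.normal(size=(10, 2)))[0]
train_a, train_b = sample_class(basis_a, 200), sample_class(basis_b, 200)
test_a, test_b = sample_class(basis_a, 100), sample_class(basis_b, 100)

model_a = fit_eigenspace(train_a, k=2)
model_b = fit_eigenspace(train_b, k=2)

def predict_is_a(x):
    # Assign each point to the class whose eigenspace leaves the smaller residual.
    return residual(x, *model_a) < residual(x, *model_b)

accuracy = 0.5 * (predict_is_a(test_a).mean() + (~predict_is_a(test_b)).mean())
print(f"balanced accuracy: {accuracy:.3f}")
```

The residual rule only separates classes whose dominant eigenspaces differ; classes that share an eigenspace would need a different statistic.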
LG Oct 7, 2021
Universality of Winning Tickets: A Renormalization Group Perspective
William T. Redman, Tianlong Chen, Zhangyang Wang et al.
Foundational work on the Lottery Ticket Hypothesis has suggested an exciting corollary: winning tickets found in the context of one task can be transferred to similar tasks, possibly even across different architectures. This has generated broad interest, but methods to study this universality are lacking. We make use of renormalization group theory, a powerful tool from theoretical physics, to address this need. We find that iterative magnitude pruning, the principal algorithm used for discovering winning tickets, is a renormalization group scheme, and can be viewed as inducing a flow in parameter space. We demonstrate that ResNet-50 models with transferable winning tickets have flows with common properties, as would be expected from the theory. Similar observations are made for BERT models, with evidence that their flows are near fixed points. Additionally, we leverage our framework to study winning tickets transferred across ResNet architectures, observing that smaller models have flows with more uniform properties than larger models, complicating transfer between them.
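For readers unfamiliar with iterative magnitude pruning (IMP), the loop of train, prune the smallest surviving weights, rewind to the initial weights, and repeat can be sketched on a toy problem. This is a minimal illustration, not the setup studied in the paper: the linear regression model, the `train` helper, the pruning fraction, and the number of rounds are all assumptions, and no renormalization group analysis is attempted here.

```python
import numpy as np

def train(w_init, mask, X, y, lr=0.1, steps=200):
    """Gradient descent on a least-squares loss, with pruned weights held at zero."""
    w = w_init * mask
    for _ in range(steps):
        grad = X.T @ (X @ w - y) / len(y)
        w = (w - lr * grad) * mask
    return w

def iterative_magnitude_pruning(w_init, X, y, prune_frac=0.2, rounds=5):
    """Repeatedly train from w_init, then prune the smallest surviving weights."""
    mask = np.ones_like(w_init)
    for _ in range(rounds):
        w = train(w_init, mask, X, y)       # rewind: always restart from w_init
        alive = np.flatnonzero(mask)
        n_prune = int(prune_frac * alive.size)
        # Indices of the lowest-magnitude weights among those still unpruned.
        drop = alive[np.argsort(np.abs(w[alive]))[:n_prune]]
        mask[drop] = 0.0
    return mask

# Hypothetical toy task: only the first 5 of 20 weights matter.
rng = np.random.default_rng(0)
X = rng.normal(size=(256, 20))
w_true = np.zeros(20)
w_true[:5] = [3.0, -2.0, 1.5, 2.5, -3.0]
y = X @ w_true
w_init = rng.normal(scale=0.1, size=20)

mask = iterative_magnitude_pruning(w_init, X, y)
print("surviving weights:", np.flatnonzero(mask))
```

Each round removes a fixed fraction of the weights that remain, so the sequence of masks traces a path through progressively sparser models.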
LG Aug 24, 2020
Local error quantification for Neural Network Differential Equation solvers
Akshunna S. Dogra, William T. Redman
Neural networks have been identified as powerful tools for the study of complex systems. A noteworthy example is the neural network differential equation (NN DE) solver, which can provide functional approximations to the solutions of a wide variety of differential equations. Such solvers produce robust functional expressions, are well suited for further manipulations on the quantities of interest (for example, taking derivatives), and are capable of leveraging the modern advances in parallelization and computing power. However, there is a lack of work on the role precise error quantification can play in their predictions: usually, the focus is on ambiguous and/or global measures of performance like the loss function and/or obtaining global bounds on the errors associated with the predictions. Precise, local error quantification is seldom possible without external means or outright knowledge of the true solution. We address these concerns in the context of dynamical system NN DE solvers, leveraging learnt information within the NN DE solvers to develop methods that allow them to be more accurate and efficient, while still pursuing an unsupervised approach that does not rely on external tools or data. We achieve this via methods that can precisely estimate NN DE solver prediction errors point-wise, thus allowing the user the capacity for efficient and targeted error correction. We exemplify the utility of our methods by testing them on one nonlinear and one chaotic system.
NA Jul 9, 2020
Error Estimation and Correction from within Neural Network Differential Equation Solvers
Akshunna S. Dogra
Neural Network Differential Equation (NN DE) solvers have surged in popularity due to a combination of factors: computational advances making their optimization more tractable, their capacity to handle high dimensional problems, easy interpretability of their models, etc. However, almost all NN DE solvers suffer from a fundamental limitation: they are trained using loss functions that depend only implicitly on the error associated with the estimate. As such, validation and error analysis of solution estimates require knowledge of the true solution. Indeed, if the true solution is unknown, we are often reduced to simply hoping that a "low enough" loss implies "small enough" errors, since explicit relationships between the two are not available/well defined. In this work, we describe a general strategy for efficiently constructing error estimates and corrections for Neural Network Differential Equation solvers. Our methods do not require advance knowledge of the true solutions and obtain explicit relationships between loss functions and the error associated with solution estimates. In turn, these explicit relationships directly allow us to estimate and correct for the errors.
NE Jun 3, 2020
Optimizing Neural Networks via Koopman Operator Theory
Akshunna S. Dogra, William T. Redman
Koopman operator theory, a powerful framework for discovering the underlying dynamics of nonlinear dynamical systems, was recently shown to be intimately connected with neural network training. In this work, we take the first steps in making use of this connection. As Koopman operator theory is a linear theory, a successful implementation of it in evolving network weights and biases offers the promise of accelerated training, especially in the context of deep networks, where optimization is inherently a non-convex problem. We show that Koopman operator theoretic methods allow for accurate predictions of weights and biases of feedforward, fully connected deep networks over a non-trivial range of training time. During this window, we find that our approach is >10x faster than various gradient descent based methods (e.g. Adam, Adadelta, Adagrad), in line with our complexity analysis. We end by highlighting open questions in this exciting intersection between dynamical systems and neural network theory. We highlight additional methods by which our results could be expanded to broader classes of networks and larger training intervals, which shall be the focus of future work.
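The idea of advancing weights with a linear operator fitted to early training snapshots can be sketched with a least-squares (DMD-style) fit. This is only an illustration under strong assumptions, not the authors' implementation: the loss is a hypothetical quadratic, for which gradient descent is exactly affine in the weights, so the fitted operator extrapolates almost perfectly. For real networks the approximation would be expected to hold only over a limited window, as the abstract notes.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical quadratic loss 0.5 * w^T A w - b^T w with a known spectrum.
Q, _ = np.linalg.qr(rng.normal(size=(4, 4)))
A = Q @ np.diag([1.0, 2.0, 3.0, 4.0]) @ Q.T
b = rng.normal(size=4)
lr = 0.1

def gd_trajectory(w0, n_steps):
    """Plain gradient descent; returns all iterates, shape (n_steps + 1, 4)."""
    ws = [w0]
    for _ in range(n_steps):
        ws.append(ws[-1] - lr * (A @ ws[-1] - b))
    return np.array(ws)

traj = gd_trajectory(rng.normal(size=4), 50)

# Append a constant 1 to each snapshot so the affine update becomes linear.
Z = np.hstack([traj, np.ones((len(traj), 1))]).T   # shape (5, 51)

# Fit K so that Z[:, t+1] ≈ K @ Z[:, t], using only the first 15 steps.
n_fit = 15
K = Z[:, 1:n_fit + 1] @ np.linalg.pinv(Z[:, :n_fit])

# Advance the weights from step 15 to step 50 without computing any gradients.
z = Z[:, n_fit]
for _ in range(50 - n_fit):
    z = K @ z

pred_error = np.linalg.norm(z[:-1] - traj[-1])
print(f"prediction error at step 50: {pred_error:.2e}")
```

Once `K` is fitted, each predicted step costs one small matrix-vector product rather than a gradient evaluation, which is the intuition behind the speed-up the abstract reports.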
COMP-PH Jan 29, 2020
Hamiltonian neural networks for solving equations of motion
Marios Mattheakis, David Sondak, Akshunna S. Dogra et al.
There has been a wave of interest in applying machine learning to study dynamical systems. We present a Hamiltonian neural network that solves the differential equations that govern dynamical systems. This is an equation-driven machine learning method where the optimization process of the network depends solely on the predicted functions without using any ground truth data. The model learns solutions that satisfy, up to an arbitrarily small error, Hamilton's equations and, therefore, conserve the Hamiltonian invariants. The choice of an appropriate activation function drastically improves the predictability of the network. Moreover, an error analysis is derived and states that the numerical errors depend on the overall network performance. The Hamiltonian network is then employed to solve the equations for the nonlinear oscillator and the chaotic Henon-Heiles dynamical system. In both systems, a symplectic Euler integrator requires two orders of magnitude more evaluation points than the Hamiltonian network in order to achieve the same order of the numerical error in the predicted phase space trajectories.
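For reference, the symplectic Euler baseline mentioned above can be sketched in a few lines. The Hamiltonian used here, H = p²/2 + x²/2 + x⁴/4, is an assumed form of a nonlinear oscillator chosen for illustration, and the step size and initial conditions are likewise hypothetical, not taken from the paper.

```python
import numpy as np

def hamiltonian(x, p):
    """Assumed nonlinear oscillator: H = p^2/2 + x^2/2 + x^4/4."""
    return 0.5 * p**2 + 0.5 * x**2 + 0.25 * x**4

def symplectic_euler(x0, p0, dt, n_steps):
    """Integrate Hamilton's equations with the semi-implicit (symplectic) Euler scheme."""
    xs = np.empty(n_steps + 1)
    ps = np.empty(n_steps + 1)
    xs[0], ps[0] = x0, p0
    for k in range(n_steps):
        # dp/dt = -dH/dx, evaluated at the current position.
        ps[k + 1] = ps[k] - dt * (xs[k] + xs[k] ** 3)
        # dx/dt = dH/dp, evaluated at the freshly updated momentum.
        xs[k + 1] = xs[k] + dt * ps[k + 1]
    return xs, ps

xs, ps = symplectic_euler(x0=1.0, p0=0.0, dt=1e-3, n_steps=10_000)
energy = hamiltonian(xs, ps)
energy_drift = np.max(np.abs(energy - energy[0]))
print(f"max energy deviation: {energy_drift:.2e}")
```

The scheme does not conserve the energy exactly, but its energy error stays bounded and shrinks with the step size, which is why matching a given accuracy can require many evaluation points.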