NA · Aug 28, 2018
Data driven Koopman spectral analysis in Vandermonde-Cauchy form via the DFT: numerical method and theoretical insights
Zlatko Drmač, Igor Mezić, Ryan Mohr
The goals and contributions of this paper are twofold. It provides a new computational tool for data driven Koopman spectral analysis by taking up the formidable challenge to develop a numerically robust algorithm by following the natural formulation via the Krylov decomposition with the Frobenius companion matrix, and by using its eigenvectors explicitly -- these are defined as the inverse of the notoriously ill-conditioned Vandermonde matrix. The key step to curb ill-conditioning is the discrete Fourier transform of the snapshots; in the new representation, the Vandermonde matrix is transformed into a generalized Cauchy matrix, which then allows accurate computation by specially tailored algorithms of numerical linear algebra. The second goal is to shed light on the connection between the formulas for optimal reconstruction weights when reconstructing snapshots using subsets of the computed Koopman modes. It is shown how using a certain weaker form of generalized inverses leads to explicit reconstruction formulas that match the abstract results from Koopman spectral theory, in particular the Generalized Laplace Analysis.
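The algebra behind the Vandermonde-to-Cauchy step can be checked numerically. The sketch below is not the paper's algorithm; it only verifies the standard geometric-sum identity under assumed conventions (a square Vandermonde matrix V[j, k] = lam_j^k, the unitary DFT matrix F, and nodes that are not n-th roots of unity). The function name `vandermonde_times_dft` and the test nodes are illustrative choices.

```python
import numpy as np

def vandermonde_times_dft(nodes):
    """Return (V @ F, D1 @ C @ D2) for V[j, k] = nodes[j]**k and the unitary DFT F.

    Assumes no node is an n-th root of unity, so the Cauchy matrix C is finite.
    """
    lam = np.asarray(nodes, dtype=complex)
    n = lam.size
    k = np.arange(n)
    V = lam[:, None] ** k[None, :]            # Vandermonde matrix
    omega = np.exp(-2j * np.pi / n)
    F = omega ** np.outer(k, k) / np.sqrt(n)  # unitary DFT matrix

    # Geometric sum: (V F)[j, l] = (1 - lam_j^n) / (sqrt(n) * (1 - lam_j * omega^l)).
    # Factoring out omega^{-l} gives a Cauchy matrix with diagonal scalings.
    xi = omega ** (-k)                        # the points omega^{-l}
    C = 1.0 / (xi[None, :] - lam[:, None])    # Cauchy matrix C[j, l] = 1 / (xi_l - lam_j)
    d1 = (1 - lam ** n) / np.sqrt(n)          # row scaling
    d2 = xi                                   # column scaling
    return V @ F, d1[:, None] * C * d2[None, :]

nodes = [0.9, 0.5 + 0.3j, -0.7j, 1.1]         # illustrative nodes, none a 4th root of unity
direct, cauchy_like = vandermonde_times_dft(nodes)
max_diff = np.max(np.abs(direct - cauchy_like))
print(max_diff)
```

The two matrices agree to rounding error for these well-separated nodes. The accuracy gains described in the abstract come from specially tailored algorithms that exploit the Cauchy structure, which this check does not implement.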
LG · Feb 17, 2023
Identifying Equivalent Training Dynamics
William T. Redman, Juan M. Bello-Rivas, Maria Fonoberova et al.
Study of the nonlinear evolution deep neural network (DNN) parameters undergo during training has uncovered regimes of distinct dynamical behavior. While a detailed understanding of these phenomena has the potential to advance improvements in training efficiency and robustness, the lack of methods for identifying when DNN models have equivalent dynamics limits the insight that can be gained from prior work. Topological conjugacy, a notion from dynamical systems theory, provides a precise definition of dynamical equivalence, offering a possible route to address this need. However, topological conjugacies have historically been challenging to compute. By leveraging advances in Koopman operator theory, we develop a framework for identifying conjugate and non-conjugate training dynamics. To validate our approach, we demonstrate that comparing Koopman eigenvalues can correctly identify a known equivalence between online mirror descent and online gradient descent. We then utilize our approach to: (a) identify non-conjugate training dynamics between shallow and wide fully connected neural networks; (b) characterize the early phase of training dynamics in convolutional neural networks; (c) uncover non-conjugate training dynamics in Transformers that do and do not undergo grokking. Our results, across a range of DNN architectures, illustrate the flexibility of our framework and highlight its potential for shedding new light on training dynamics.
LG · Oct 28, 2021
An Operator Theoretic View on Pruning Deep Neural Networks
William T. Redman, Maria Fonoberova, Ryan Mohr et al.
The discovery of sparse subnetworks that are able to perform as well as full models has found broad applied and theoretical interest. While many pruning methods have been developed to this end, the naïve approach of removing parameters based on their magnitude has been found to be as robust as more complex, state-of-the-art algorithms. The lack of theory behind magnitude pruning's success, especially pre-convergence, and its relation to other pruning methods, such as gradient based pruning, remain outstanding open questions in the field. We make use of recent advances in dynamical systems theory, namely Koopman operator theory, to define a new class of theoretically motivated pruning algorithms. We show that these algorithms can be equivalent to magnitude and gradient based pruning, unifying these seemingly disparate methods, and find that they can be used to shed light on magnitude pruning's performance during the early part of training.
LG · Dec 21, 2020
Predicting the Critical Number of Layers for Hierarchical Support Vector Regression
Ryan Mohr, Maria Fonoberova, Zlatko Drmač et al.
Hierarchical support vector regression (HSVR) models a function from data as a linear combination of SVR models at a range of scales, starting at a coarse scale and moving to finer scales as the hierarchy continues. In the original formulation of HSVR, there were no rules for choosing the depth of the model. In this paper, we observe in a number of models a phase transition in the training error -- the error remains relatively constant as layers are added, until a critical scale is passed, at which point the training error drops close to zero and remains nearly constant for added layers. We introduce a method to predict this critical scale a priori with the prediction based on the support of either a Fourier transform of the data or the Dynamic Mode Decomposition (DMD) spectrum. This allows us to determine the required number of layers prior to training any models.
LG · Jun 21, 2020
Applications of Koopman Mode Analysis to Neural Networks
Iva Manojlović, Maria Fonoberova, Ryan Mohr et al.
We consider the training process of a neural network as a dynamical system acting on the high-dimensional weight space. Each epoch is an application of the map induced by the optimization algorithm and the loss function. Using this induced map, we can apply observables on the weight space and measure their evolution. The evolution of the observables is given by the Koopman operator associated with the induced dynamical system. We use the spectrum and modes of the Koopman operator to pursue the following objectives. Our methods can help to determine the network depth a priori; to detect a bad initialization of the network weights, allowing a restart before training too long; and to speed up training. Additionally, our methods help enable noise rejection and improve robustness. We show how the Koopman spectrum can be used to determine the number of layers required for the architecture. We also show how monitoring the spectrum can elucidate the convergence versus non-convergence of the training process, in particular, how the existence of eigenvalues clustering around 1 determines when to terminate the learning process. We further show how, using Koopman modes, we can selectively prune the network to speed up the training procedure. Finally, we show that incorporating loss functions based on negative Sobolev norms can allow for the reconstruction of a multi-scale signal polluted by very large amounts of noise.
NA · Aug 9, 2017
Data driven modal decompositions: analysis and enhancements
Zlatko Drmač, Igor Mezić, Ryan Mohr
The Dynamic Mode Decomposition (DMD) is a tool of the trade in computational data driven analysis of fluid flows. More generally, it is a computational device for Koopman spectral analysis of nonlinear dynamical systems, with a plethora of applications in applied sciences and engineering. Its exceptional performance triggered the development of several modifications that make the DMD an attractive method in the data driven framework. This work offers further improvements to the DMD to make it more reliable and to enhance its functionality. In particular, a data driven formula for the residuals allows selection of the Ritz pairs, thus providing more precise spectral information about the underlying Koopman operator, and the well-known technique of refining the Ritz vectors is adapted to data driven scenarios. Further, the DMD is formulated in the more general setting of weighted inner product spaces, and the consequences for numerical computation are discussed in detail. Numerical experiments are used to illustrate the advantages of the proposed method, designated as DDMD_RRR (Refined Rayleigh Ritz Data Driven Modal Decomposition).
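To illustrate how residuals can be computed from data alone, here is a minimal sketch of SVD-based DMD that also returns a residual for each Ritz pair. It assumes snapshots stored as columns and a simple relative rank cutoff, and it omits the refinement and weighted inner product steps that the abstract mentions. The name `dmd_with_residuals` and the synthetic test system are hypothetical, not taken from the paper.

```python
import numpy as np

def dmd_with_residuals(snapshots, rank=None):
    """Basic SVD-based DMD returning Ritz values, Ritz vectors and residuals.

    snapshots: array of shape (n, m + 1) whose columns are x_0, ..., x_m.
    The residual of a Ritz pair (lam, z) is ||A z - lam z||, evaluated
    without access to A by using A U = Y V S^{-1} on the data.
    """
    X, Y = snapshots[:, :-1], snapshots[:, 1:]
    U, s, Vh = np.linalg.svd(X, full_matrices=False)
    if rank is None:
        rank = int(np.sum(s > s[0] * 1e-12))  # assumed cutoff, not from the paper
    U, s, Vh = U[:, :rank], s[:rank], Vh[:rank, :]

    B = Y @ Vh.conj().T / s            # B = Y V S^{-1}, equals A U on the data
    S = U.conj().T @ B                 # Rayleigh quotient of A on range(U)
    eigvals, W = np.linalg.eig(S)      # Ritz values and coordinates
    Z = U @ W                          # Ritz vectors (unit 2-norm columns)
    residuals = np.linalg.norm(B @ W - Z * eigvals, axis=0)
    return eigvals, Z, residuals

# Synthetic linear system with known eigenvalues (illustrative data only).
rng = np.random.default_rng(0)
P = rng.standard_normal((3, 3))
A = P @ np.diag([0.9, 0.5, -0.3]) @ np.linalg.inv(P)
cols = [rng.standard_normal(3)]
for _ in range(5):
    cols.append(A @ cols[-1])
snapshots = np.column_stack(cols)

eigvals, modes, residuals = dmd_with_residuals(snapshots)
print(np.sort(eigvals.real), residuals)
```

For exactly linear, full-rank data the Ritz values match the eigenvalues of `A` and the residuals are near machine precision. On real data, larger residuals flag Ritz pairs that may be less trustworthy.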