Felipe Hernández

LG · Feb 27, 2023

Analyzing Populations of Neural Networks via Dynamical Model Embedding

Jordan Cotler, Kai Sheng Tai, Felipe Hernández et al. · Stanford

A core challenge in the interpretation of deep neural networks is identifying commonalities between the underlying algorithms implemented by distinct networks trained for the same task. Motivated by this problem, we introduce DYNAMO, an algorithm that constructs low-dimensional manifolds where each point corresponds to a neural network model, and two points are nearby if the corresponding neural networks enact similar high-level computational processes. DYNAMO takes as input a collection of pre-trained neural networks and outputs a meta-model that emulates the dynamics of the hidden states as well as the outputs of any model in the collection. The specific model to be emulated is determined by a model embedding vector that the meta-model takes as input; these model embedding vectors constitute a manifold corresponding to the given population of models. We apply DYNAMO to both RNNs and CNNs, and find that the resulting model embedding spaces enable novel applications: clustering of neural networks on the basis of their high-level computational processes in a manner that is less sensitive to reparameterization; model averaging of several neural networks trained on the same task to arrive at a new, operable neural network with similar task performance; and semi-supervised learning via optimization on the model embedding space. Using a fixed-point analysis of meta-models trained on populations of RNNs, we gain new insights into how similarities of the topology of RNN dynamics correspond to similarities of their high-level computational processes.
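As a rough illustration of the interface described in the abstract, the sketch below conditions a single recurrent update on a model embedding vector, so that different vectors select different dynamics. It is not the authors' implementation: the architecture, the dimensions, the untrained random weights, and the names `meta_step`, `run`, and `theta_*` are all hypothetical. Taking the midpoint of two embedding vectors is likewise only an assumed reading of "model averaging".

```python
import numpy as np

rng = np.random.default_rng(0)
EMBED_DIM, HIDDEN_DIM, INPUT_DIM, OUTPUT_DIM = 2, 8, 3, 4

# Hypothetical meta-model weights. They are random and untrained here;
# in DYNAMO they would be fit so the meta-model emulates the collection.
W_h = rng.normal(scale=0.3, size=(HIDDEN_DIM, HIDDEN_DIM))
W_x = rng.normal(scale=0.3, size=(HIDDEN_DIM, INPUT_DIM))
W_theta = rng.normal(scale=0.3, size=(HIDDEN_DIM, EMBED_DIM))
W_out = rng.normal(scale=0.3, size=(OUTPUT_DIM, HIDDEN_DIM))


def meta_step(theta, h, x):
    """One recurrent update conditioned on the model embedding theta."""
    h_next = np.tanh(W_h @ h + W_x @ x + W_theta @ theta)
    y = W_out @ h_next
    return h_next, y


def run(theta, inputs):
    """Unroll the meta-model over a sequence for a fixed embedding."""
    h = np.zeros(HIDDEN_DIM)
    outputs = []
    for x in inputs:
        h, y = meta_step(theta, h, x)
        outputs.append(y)
    return np.stack(outputs)


# Two hypothetical embedding vectors and their midpoint
# (an assumed form of model averaging in embedding space).
theta_a = np.array([1.0, 0.0])
theta_b = np.array([0.0, 1.0])
theta_avg = 0.5 * (theta_a + theta_b)

inputs = rng.normal(size=(5, INPUT_DIM))
out_a = run(theta_a, inputs)
out_b = run(theta_b, inputs)
out_avg = run(theta_avg, inputs)
```

The embedding enters only as an extra input to the shared update, so every point in the embedding space yields a model that can be run directly.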

NA · Apr 24, 2017

Scalable computation of Jordan chains

Felipe Hernández, Adi Pick, Steven G. Johnson

We present an algorithm to compute the Jordan chain of a nearly defective matrix with a $2\times2$ Jordan block. The algorithm is based on an inverse-iteration procedure and only needs information about the invariant subspace corresponding to the Jordan chain, making it suitable for use with large matrices arising in applications, in contrast with existing algorithms which rely on an SVD. The algorithm produces the eigenvector and Jordan vector with $O(\varepsilon)$ error, with $\varepsilon$ being the distance of the given matrix to an exactly defective matrix. As an example, we demonstrate the use of this algorithm in a problem arising from electromagnetism, in which the matrix has size $212^2\times 212^2$. An extension of this algorithm is also presented which can achieve higher order convergence [$O(\varepsilon^2)$] when the matrix derivative is known.
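For readers unfamiliar with the terminology, the sketch below illustrates what a Jordan chain for a $2\times2$ block is. It does not implement the paper's inverse-iteration algorithm. It builds a small, exactly defective matrix and checks the defining relations $(A-\lambda I)x_0 = 0$ and $(A-\lambda I)j_0 = x_0$. It then perturbs the matrix by $\varepsilon$ to show the eigenvalue splitting that makes a matrix "nearly defective". The matrices `S` and `J` and the helper `jordan_chain_residuals` are illustrative choices, not taken from the paper.

```python
import numpy as np


def jordan_chain_residuals(A, lam, x0, j0):
    """Residual norms of the two relations defining a length-2 Jordan chain."""
    M = A - lam * np.eye(A.shape[0])
    return np.linalg.norm(M @ x0), np.linalg.norm(M @ j0 - x0)


lam = 2.0
J = np.array([[lam, 1.0],
              [0.0, lam]])          # a single 2x2 Jordan block
S = np.array([[1.0, 2.0],
              [3.0, 4.0]])          # arbitrary invertible change of basis
S_inv = np.linalg.inv(S)

# Exactly defective matrix: the columns of S are the eigenvector x0
# and the Jordan vector j0.
A = S @ J @ S_inv
x0, j0 = S[:, 0], S[:, 1]
r_eig, r_jordan = jordan_chain_residuals(A, lam, x0, j0)

# Nearly defective matrix: an eps perturbation in the lower-left entry
# splits the double eigenvalue into lam +/- sqrt(eps).
eps = 1e-8
J_eps = np.array([[lam, 1.0],
                  [eps, lam]])
A_eps = S @ J_eps @ S_inv
mu = np.linalg.eigvals(A_eps)
gap = abs(mu[0] - mu[1])            # analytically 2 * sqrt(eps)
```

The gap scales like $\sqrt{\varepsilon}$ rather than $\varepsilon$, so a tiny perturbation of a defective matrix moves its eigenvalues comparatively far. That sensitivity is why nearly defective matrices need the dedicated treatment the abstract describes.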

2 Papers