40.7LGMar 21
Neural collapse in the orthoplex regimeJames Alcala, Rayna Andreeva, Vladimir A. Kobzar et al.
When training a neural network for classification, the feature vectors of the training set are known to collapse to the vertices of a regular simplex, provided the dimension $d$ of the feature space and the number $n$ of classes satisfies $n\leq d+1$. This phenomenon is known as neural collapse. For other applications like language models, one instead takes $n\gg d$. Here, the neural collapse phenomenon still occurs, but with different emergent geometric figures. We characterize these geometric figures in the orthoplex regime where $d+2\leq n\leq 2d$. The techniques in our analysis primarily involve Radon's theorem and convexity.
LGJan 18, 2023
On the limits of neural network explainability via descramblingShashank Sule, Richard G. Spencer, Wojciech Czaja
We characterize the exact solutions to neural network descrambling--a mathematical model for explaining the fully connected layers of trained neural networks (NNs). By reformulating the problem to the minimization of the Brockett function arising in graph matching and complexity theory we show that the principal components of the hidden layer preactivations can be characterized as the optimal explainers or descramblers for the layer weights, leading to descrambled weight matrices. We show that in typical deep learning contexts these descramblers take diverse and interesting forms including (1) matching largest principal components with the lowest frequency modes of the Fourier basis for isotropic hidden data, (2) discovering the semantic development in two-layer linear NNs for signal recovery problems, and (3) explaining CNNs by optimally permuting the neurons. Our numerical experiments indicate that the eigendecompositions of the hidden layer data--now understood as the descramblers--can also reveal the layer's underlying transformation. These results illustrate that the SVD is more directly related to the explainability of NNs than previously thought and offers a promising avenue for discovering interpretable motifs for the hidden action of NNs, especially in contexts of operator learning or physics-informed NNs, where the input/output data has limited human readability.
NADec 22, 2023
Sharp error estimates for target measure diffusion maps with applications to the committor problemShashank Sule, Luke Evans, Maria Cameron
We obtain asymptotically sharp error estimates for the consistency error of the Target Measure Diffusion map (TMDmap) (Banisch et al. 2020), a variant of diffusion maps featuring importance sampling and hence allowing input data drawn from an arbitrary density. The derived error estimates include the bias error and the variance error. The resulting convergence rates are consistent with the approximation theory of graph Laplacians. The key novelty of our results lies in the explicit quantification of all the prefactors on leading-order terms. We also prove an error estimate for solutions of Dirichlet BVPs obtained using TMDmap, showing that the solution error is controlled by consistency error. We use these results to study an important application of TMDmap in the analysis of rare events in systems governed by overdamped Langevin dynamics using the framework of transition path theory (TPT). The cornerstone ingredient of TPT is the solution of the committor problem, a boundary value problem for the backward Kolmogorov PDE. Remarkably, we find that the TMDmap algorithm is particularly suited as a meshless solver to the committor problem due to the cancellation of several error terms in the prefactor formula. Furthermore, significant improvements in bias and variance errors occur when using a quasi-uniform sampling density. Our numerical experiments show that these improvements in accuracy are realizable in practice when using $δ$-nets as spatially uniform inputs to the TMDmap algorithm.
STFeb 10, 2025
Neumann eigenmaps for landmark embeddingShashank Sule, Wojciech Czaja
We present Neumann eigenmaps (NeuMaps), a novel approach for enhancing the standard diffusion map embedding using landmarks, i.e distinguished samples within the dataset. By interpreting these landmarks as a subgraph of the larger data graph, NeuMaps are obtained via the eigendecomposition of a renormalized Neumann Laplacian. We show that NeuMaps offer two key advantages: (1) they provide a computationally efficient embedding that accurately recovers the diffusion distance associated with the reflecting random walk on the subgraph, and (2) they naturally incorporate the Nyström extension within the diffusion map framework through the discrete Neumann boundary condition. Through examples in digit classification and molecular dynamics, we demonstrate that NeuMaps not only improve upon existing landmark-based embedding methods but also enhance the stability of diffusion map embeddings to the removal of highly significant points.