12.3 STAT-MECH Mar 30
How much of persistent homology is topology? A quantitative decomposition for spin model phase transitions
Matthew Loftus
Point-cloud persistent homology (PH) -- computing alpha or Rips complexes on spin-position point clouds -- has been widely applied to detect phase transitions in classical spin models since Donato et al. (2016), with subsequent studies attributing the detection to the topological content of the persistence diagram. We ask a simple question that has not been posed: what fraction of the PH signal is genuinely topological? We introduce f_topo, a quantitative decomposition that separates the density-driven and topological contributions to any PH statistic by comparing real spin configurations against density-matched shuffled null models. Across the 2D Ising model (system sizes L = 16-128, ten temperatures) and Potts models (q = 3, 5), we find that H_0 statistics -- total persistence, persistence entropy, feature count -- are 94-100% density-driven (f_topo < 0.07). The density-matched shuffled null detects T_c at the same location as the real configurations, with a comparable peak height, showing that density alone is sufficient for phase transition detection. However, H_1 statistics are partially topological: the topological excess grows with system size as delta(TP_{H_1}) ~ L^{0.53} and follows a finite-size scaling collapse delta(T, L) = L^{0.53} g(tL^{1/nu}) with collapse quality CV = 0.27. The longest persistence bar is strongly topological (f_topo > 1) and scales with the correlation length. A scale-resolved analysis reveals that the topological excess shifts from large-scale to small-scale features as L increases. We propose that the TDA-for-phase-transitions community adopt shuffled null models as standard practice, and that H_1 rather than H_0 statistics be used when genuine topological information is sought.
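The comparison against a density-matched shuffled null can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the paper's pipeline: the point cloud is taken to be the up-spin sites, the H_0 total persistence is computed as the minimum-spanning-tree length (which holds for a Rips filtration indexed by distance, where all H_0 bars are born at 0), and the hypothetical `density_matched_excess` normalizes the real-minus-null difference by the null mean. The abstract does not state the exact definition of f_topo, so this ratio should not be read as that quantity.

```python
import math
import random


def h0_total_persistence(points):
    """Sum of minimum-spanning-tree edge lengths (Prim's algorithm, O(n^2)).

    For a Rips filtration indexed by distance, finite H_0 bars die at
    MST edge lengths, so this sum is the H_0 total persistence.
    """
    n = len(points)
    if n < 2:
        return 0.0
    in_tree = [False] * n
    best = [math.inf] * n  # cheapest known edge connecting each point to the tree
    best[0] = 0.0
    total = 0.0
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        total += best[u]
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v] = d
    return total


def up_spin_points(spins):
    """Lattice coordinates of the +1 spins (assumed point-cloud construction)."""
    return [(x, y) for y, row in enumerate(spins) for x, s in enumerate(row) if s == 1]


def shuffled_null(spins, rng):
    """Permute spin values over sites: same up-spin density, no spatial correlations."""
    width = len(spins[0])
    flat = [s for row in spins for s in row]
    rng.shuffle(flat)
    return [flat[i * width:(i + 1) * width] for i in range(len(spins))]


def density_matched_excess(spins, rng, n_null=20):
    """Hypothetical normalization: |real - <null>| / <null>."""
    real = h0_total_persistence(up_spin_points(spins))
    null = [h0_total_persistence(up_spin_points(shuffled_null(spins, rng)))
            for _ in range(n_null)]
    null_mean = sum(null) / len(null)
    return abs(real - null_mean) / null_mean if null_mean > 0 else 0.0


# Toy configuration: an 8x8 lattice whose left half is spin-up.
rng = random.Random(0)
size = 8
spins = [[1 if x < size // 2 else -1 for x in range(size)] for _ in range(size)]
real_tp = h0_total_persistence(up_spin_points(spins))
excess = density_matched_excess(spins, rng)
```

The shuffle preserves the number of up spins exactly, so any difference between `real_tp` and the null average comes from spatial arrangement rather than density. A real analysis would use sampled equilibrium configurations and a persistent homology library in place of the MST shortcut, which only covers H_0.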
43.3 STAT-MECH Apr 1
The topological gap at criticality: scaling exponent d + η, universality, and scope
Matthew Loftus
The topological gap $Δ = TP_{H_1}^{real} - TP_{H_1}^{shuf}$ -- the excess $H_1$ total persistence of the majority-spin alpha complex over a density-matched null -- encodes critical correlations in spin models. We establish finite-size scaling: $Δ(L,T) = A L^{d+η} G_-(L|t/T_c|)$, with $G_-(x) \sim (1+x/x_0)^{-(1+β/ν)}$. For 2D Ising, $α = 2.249 \pm 0.038$, matching $d+η = 9/4$ to $0.03σ$; the $G_-$ exponent $γ = 1.089 \pm 0.077$ is consistent with $1+β/ν = 9/8$ ($ΔR^2 < 10^{-5}$). For 2D Potts $q=3$ with $L$ up to 1024, $α = 2.272 \pm 0.024$ ($0.2σ$ from $d+η = 2.267$), with two-term corrections to scaling ($R^2 = 0.9999$). The $G_-$ exponent $γ = 1.114$ (68% CI $[1.053, 1.173]$) matches $1+β/ν = 17/15$. Scope boundaries: the law fails for 2D Potts $q=4$ ($α = 2.347 \pm 0.017$, $9.3σ$ from $d+η = 5/2$) where logarithmic corrections prevent convergence, and for raw 3D Ising ($4σ$ from $d+η$), but density normalization $Δ/|M|^{1/2}$ recovers $α = 3.06 \pm 0.04$ ($0.6σ$). The framework fails for first-order, BKT, and percolation. The criterion: $α = d+η$ holds when corrections to scaling are algebraic ($ω > 0$) but fails when logarithmic ($ω \to 0$).
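As a rough illustration of how an exponent like α and its distance from $d+η$ in units of σ might be obtained, the sketch below fits a pure power law by least squares in log-log space and computes the tension between an estimate and a target. It assumes measurements taken at $T_c$, where the scaling function is constant, and it ignores corrections to scaling and error propagation. The helper names and the synthetic data are hypothetical; only the 2D Ising numbers in the last line come from the abstract.

```python
import math


def fit_power_law(sizes, values):
    """Least-squares fit of values ~ amplitude * sizes**alpha in log-log space."""
    xs = [math.log(size) for size in sizes]
    ys = [math.log(value) for value in values]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    alpha = sxy / sxx
    amplitude = math.exp(mean_y - alpha * mean_x)
    return alpha, amplitude


def tension_in_sigma(estimate, stderr, target):
    """Distance between a fitted value and a target, in standard errors."""
    return abs(estimate - target) / stderr


# Synthetic, noise-free gap values obeying an exact power law (illustration only).
sizes = [16, 32, 64, 128]
gaps = [0.5 * size ** 2.25 for size in sizes]
alpha_hat, amplitude_hat = fit_power_law(sizes, gaps)

# Numbers quoted above for 2D Ising: alpha = 2.249 +/- 0.038 against d + eta = 9/4.
ising_tension = tension_in_sigma(2.249, 0.038, 9 / 4)
```

With the quoted values, `ising_tension` evaluates to about 0.026, consistent with the stated $0.03σ$.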
3.9 LG Mar 29
Spectral Signatures of Data Quality: Eigenvalue Tail Index as a Diagnostic for Label Noise in Neural Networks
Matthew Loftus
We investigate whether spectral properties of neural network weight matrices can predict test accuracy. Under controlled label noise variation, the tail index alpha of the eigenvalue distribution at the network's bottleneck layer predicts test accuracy with leave-one-out R^2 = 0.984 (21 noise levels, 3 seeds per level), far exceeding all baselines: the best conventional metric (Frobenius norm of the optimal layer) achieves LOO R^2 = 0.149. This relationship holds across three architectures (MLP, CNN, ResNet-18) and two datasets (MNIST, CIFAR-10). However, under hyperparameter variation at fixed data quality (180 configurations varying width, depth, learning rate, and weight decay), all spectral and conventional measures are weak predictors (R^2 < 0.25), with simple baselines (global L_2 norm, LOO R^2 = 0.219) slightly outperforming spectral measures (tail alpha, LOO R^2 = 0.167). We therefore frame the tail index as a data quality diagnostic: a powerful detector of label corruption and training set degradation, rather than a universal generalization predictor. A noise detector calibrated on synthetic noise successfully identifies real human annotation errors in CIFAR-10N (9% noise detected with 3% error). We identify the information-processing bottleneck layer as the locus of this signature and connect the observations to the BBP phase transition in spiked random matrix models. We also report a negative result: the level spacing ratio <r> is uninformative for weight matrices due to Wishart universality.
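The abstract does not say how the tail index alpha is estimated. One common choice is the Hill estimator, sketched below as an assumption rather than the paper's method. It estimates the exponent of the survival function, P(X > x) ~ x^{-alpha}; other conventions (for example the exponent of the density) differ by one. The function name and the synthetic Pareto sample are hypothetical stand-ins for the eigenvalues of a weight matrix.

```python
import math
import random


def hill_tail_index(values, k):
    """Hill estimator of the tail index from the k largest positive values.

    Assumes a survival function P(X > x) ~ x**(-alpha) in the upper tail.
    """
    xs = sorted((v for v in values if v > 0), reverse=True)
    if not 1 <= k < len(xs):
        raise ValueError("k must satisfy 1 <= k < number of positive values")
    threshold = xs[k]  # the (k+1)-th largest value
    mean_log_excess = sum(math.log(x / threshold) for x in xs[:k]) / k
    return 1.0 / mean_log_excess


# Synthetic check on samples with a known tail index (not real eigenvalues).
rng = random.Random(0)
true_alpha = 2.5
samples = [rng.paretovariate(true_alpha) for _ in range(20000)]
alpha_hat = hill_tail_index(samples, k=2000)
```

The choice of `k` trades bias against variance; for real spectra it would need to be selected with care, for instance by checking that the estimate is stable over a range of `k`.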
25.1 ML Mar 29
Persistence diagrams of random matrices via Morse theory: universality and a new spectral diagnostic
Matthew Loftus
We prove that the persistence diagram of the sublevel set filtration of the quadratic form f(x) = x^T M x restricted to the unit sphere S^{n-1} is analytically determined by the eigenvalues of the symmetric matrix M. By Morse theory, the diagram has exactly n-1 finite bars, with the k-th bar living in homological dimension k-1 and having length equal to the k-th eigenvalue spacing s_k = λ_{k+1} - λ_k. This identification transfers random matrix theory (RMT) universality to persistence diagram universality: for matrices drawn from the Gaussian Orthogonal Ensemble (GOE), we derive the closed-form persistence entropy PE = log(8n/π) - 1, and verify numerically that the coefficient of variation of persistence statistics decays as n^{-0.6}. Different random matrix ensembles (GOE, GUE, Wishart) produce distinct universal persistence diagrams, providing topological fingerprints of RMT universality classes. As a practical consequence, we show that persistence entropy outperforms the standard level spacing ratio \langle r \rangle for discriminating GOE from GUE matrices (AUC 0.978 vs. 0.952 at n = 100, non-overlapping bootstrap 95% CIs), and detects global spectral perturbations in the Rosenzweig-Porter model to which \langle r \rangle is blind. These results establish persistence entropy as a new spectral diagnostic that captures complementary information to existing RMT tools.
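Since the bars are identified with eigenvalue spacings, the persistence entropy can be computed directly from a spectrum. The sketch below assumes the standard definition of persistence entropy, PE = -Σ p_k log p_k with p_k = s_k / Σ_j s_j, and takes the eigenvalues as given. The function names are illustrative, and degenerate (zero-length) spacings are simply skipped.

```python
import math


def spacing_bars(eigenvalues):
    """Bar lengths s_k = lambda_{k+1} - lambda_k from the sorted spectrum."""
    lam = sorted(eigenvalues)
    return [upper - lower for lower, upper in zip(lam, lam[1:])]


def persistence_entropy(bar_lengths):
    """Shannon entropy of the bar lengths normalized to sum to one."""
    positive = [s for s in bar_lengths if s > 0]
    total = sum(positive)
    if total == 0:
        return 0.0
    return -sum((s / total) * math.log(s / total) for s in positive)


# Evenly spaced spectrum: n - 1 equal bars give the maximal entropy log(n - 1).
uniform_spectrum = [0.0, 1.0, 2.0, 3.0, 4.0]
pe_uniform = persistence_entropy(spacing_bars(uniform_spectrum))

# One dominant gap concentrates the normalized lengths and lowers the entropy.
gapped_spectrum = [0.0, 0.1, 0.2, 0.3, 10.0]
pe_gapped = persistence_entropy(spacing_bars(gapped_spectrum))
```

An evenly spaced spectrum maximizes the entropy, while an outlier eigenvalue lowers it, which gives some intuition for why the statistic responds to global changes in the spectrum.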