MLFeb 11
A solvable high-dimensional model where nonlinear autoencoders learn structure invisible to PCA while test loss misaligns with generalizationVicente Conde Mendes, Lorenzo Bardone, Cédric Koller et al.
Many real-world datasets contain hidden structure that cannot be detected by simple linear correlations between input features. For example, latent factors may influence the data in a coordinated way, even though their effect is invisible to covariance-based methods such as PCA. In practice, nonlinear neural networks often succeed in extracting such hidden structure in unsupervised and self-supervised learning. However, constructing a minimal high-dimensional model where this advantage can be rigorously analyzed has remained an open theoretical challenge. We introduce a tractable high-dimensional spiked model with two latent factors: one visible to covariance, and one statistically dependent yet uncorrelated, appearing only in higher-order moments. PCA and linear autoencoders fail to recover the latter, while a minimal nonlinear autoencoder provably extracts both. We analyze both the population risk, and empirical risk minimization. Our model also provides a tractable example where self-supervised test loss is poorly aligned with representation quality: nonlinear autoencoders recover latent structure that linear methods miss, even though their reconstruction loss is higher.
CGAug 13, 2025
Counting Short Trajectories in Elementary Cellular Automata using the Transfer Matrix MethodCédric Koller, Barbora Hudcová
Elementary Cellular Automata (ECAs) exhibit diverse behaviours often categorized by Wolfram's qualitative classification. To provide a quantitative basis for understanding these behaviours, we investigate the global dynamics of such automata and we describe a method that allows us to compute the number of all configurations leading to short attractors in a limited number of time steps. This computation yields exact results in the thermodynamic limit (as the CA grid size grows to infinity), and is based on the Transfer Matrix Method (TMM) that we adapt for our purposes. Specifically, given two parameters $(p, c)$ we are able to compute the entropy of all initial configurations converging to an attractor of size $c$ after $p$ time-steps. By calculating such statistics for various ECA rules, we establish a quantitative connection between the entropy and the qualitative Wolfram classification scheme. Class 1 rules rapidly converge to maximal entropy for stationary states ($c=1$) as $p$ increases. Class 2 rules also approach maximal entropy quickly for appropriate cycle lengths $c$, potentially requiring consideration of translations. Class 3 rules exhibit zero or low finite entropy that saturates after a short transient. Class 4 rules show finite positive entropy, similar to some Class 3 rules. This method provides a precise framework for quantifying trajectory statistics, although its exponential computational cost in $p+c$ restricts practical analysis to short trajectories.