LGOct 9, 2025

Who Said Neural Networks Aren't Linear?

arXiv:2510.08570v13 citationsh-index: 12
Originality Highly original
AI Analysis

This work addresses a foundational problem in machine learning by enabling linear algebra techniques for nonlinear models, with potential broad impact across AI domains.

The paper tackles the problem of applying linear algebra tools to nonlinear neural networks by introducing Linearizers, which make networks linear relative to constructed vector spaces, and demonstrates that this framework reduces diffusion model sampling from hundreds of steps to one step.

Neural networks are famously nonlinear. However, linearity is defined relative to a pair of vector spaces, $f$$:$$X$$\to$$Y$. Is it possible to identify a pair of non-standard vector spaces for which a conventionally nonlinear function is, in fact, linear? This paper introduces a method that makes such vector spaces explicit by construction. We find that if we sandwich a linear operator $A$ between two invertible neural networks, $f(x)=g_y^{-1}(A g_x(x))$, then the corresponding vector spaces $X$ and $Y$ are induced by newly defined addition and scaling actions derived from $g_x$ and $g_y$. We term this kind of architecture a Linearizer. This framework makes the entire arsenal of linear algebra, including SVD, pseudo-inverse, orthogonal projection and more, applicable to nonlinear mappings. Furthermore, we show that the composition of two Linearizers that share a neural network is also a Linearizer. We leverage this property and demonstrate that training diffusion models using our architecture makes the hundreds of sampling steps collapse into a single step. We further utilize our framework to enforce idempotency (i.e. $f(f(x))=f(x)$) on networks leading to a globally projective generative model and to demonstrate modular style transfer.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes