LG NA Oct 22, 2025

Matrix-Free Least Squares Solvers: Values, Gradients, and What to Do With Them

Hrittik Roy, Søren Hauberg, Nicholas Krämer

arXiv:2510.19634v1 7.11 citations h-index: 8

Originality: Highly original

AI Analysis

This work gives machine learning practitioners a way to use least squares as a differentiable component inside larger models, much like a neural network layer.

The paper tackles the underutilization of least squares in machine learning by deriving custom gradients to make it a differentiable operator, enabling applications like enforcing sparsity in large models, imposing constraints in generative models, and tuning hyperparameters in Gaussian processes.

This paper argues that the method of least squares has significant unfulfilled potential in modern machine learning, far beyond merely being a tool for fitting linear models. To release its potential, we derive custom gradients that transform the solver into a differentiable operator, like a neural network layer, enabling many diverse applications. Empirically, we demonstrate: (i) scalability by enforcing weight sparsity on a 50 million parameter model; (ii) imposing conservativeness constraints in score-based generative models; and (iii) hyperparameter tuning of Gaussian processes based on predictive performance. By doing this, our work represents the next iteration in developing differentiable linear-algebra tools and making them widely accessible to machine learning practitioners.
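To make the idea of a "custom gradient" for a least-squares solve concrete, here is a minimal NumPy sketch. It is an illustration under stated assumptions, not the solver or the derivation used in the paper: `A` is assumed to have full column rank, the solve runs conjugate gradients on the normal equations (a simple stand-in that touches `A` only through products with `A` and `A.T`), and the names `cg`, `lstsq_forward`, and `lstsq_vjp` are hypothetical. The backward pass uses the standard identity obtained by differentiating the normal equations: with $w = (A^\top A)^{-1}\bar{x}$ and $r = b - Ax$, the cotangents are $\bar{b} = Aw$ and $\bar{A} = r w^\top - (Aw)\,x^\top$.

```python
import numpy as np


def cg(apply, rhs, tol=1e-12, maxiter=200):
    """Conjugate gradients for a symmetric positive-definite operator.

    `apply` is a function v -> G v, so the matrix G is never formed.
    Stops once the (recursively updated) residual norm is below tol * ||rhs||.
    """
    x = np.zeros_like(rhs)
    r = rhs.copy()  # residual for the zero initial guess
    p = r.copy()
    rs = r @ r
    threshold = tol * np.linalg.norm(rhs)
    for _ in range(maxiter):
        if np.sqrt(rs) <= threshold:
            break
        Gp = apply(p)
        alpha = rs / (p @ Gp)
        x = x + alpha * p
        r = r - alpha * Gp
        rs_new = r @ r
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x


def lstsq_forward(A, b):
    """Solve min_x ||A x - b||^2 using only products with A and A.T."""
    normal = lambda v: A.T @ (A @ v)
    return cg(normal, A.T @ b)


def lstsq_vjp(A, b, x, x_bar):
    """Pull a cotangent x_bar on the solution back to cotangents on A and b.

    Assumes A has full column rank, so A.T @ A is invertible.
    """
    normal = lambda v: A.T @ (A @ v)
    w = cg(normal, x_bar)  # one extra solve with the same operator
    r = b - A @ x          # least-squares residual
    b_bar = A @ w
    A_bar = np.outer(r, w) - np.outer(b_bar, x)
    return A_bar, b_bar


# Usage: gradients of the example loss L = 0.5 * ||x(A, b)||^2.
rng = np.random.default_rng(0)
A = rng.standard_normal((20, 5))
b = rng.standard_normal(20)

x = lstsq_forward(A, b)
# For this loss, dL/dx = x, so x itself is the incoming cotangent.
A_bar, b_bar = lstsq_vjp(A, b, x, x_bar=x)
```

The backward pass costs one additional solve with the same normal operator, which is what lets a least-squares solve sit inside a larger model like any other layer. In an automatic-differentiation framework, the same pair of functions would typically be registered as the forward and backward rules of a single operator.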

