NAApr 2
Linear Systems and Eigenvalue Problems: Open Questions from a Simons WorkshopNoah Amsel, Yves Baumann, Paul Beckman et al. · berkeley
This document presents a series of open questions arising in matrix computations, i.e., the numerical solution of linear algebra problems. It is a result of working groups at the workshop Linear Systems and Eigenvalue Problems, which was organized at the Simons Institute for the Theory of Computing program on Complexity and Linear Algebra in Fall 2025. The complexity and numerical solution of linear algebra problems is a crosscutting area between theoretical computer science and numerical analysis. The value of the particular problem formulations here is that they were produced via discussions between researchers from both groups. The open questions are organized in five categories: iterative solvers for linear systems, eigenvalue computation, low-rank approximation, randomized sketching, and other areas including tensors, quantum systems, and matrix functions.
NAMay 31
Transpose-free linear algebraDiana Halikias, Michiel E. Hochstenbach, Alex Townsend
We study the limitations of matrix-free algorithms that access a matrix $A$ only through forward matrix-vector products (matvecs) $x \mapsto Ax$, without access to the transpose $A^\top$ or its action. This setting arises naturally in operator learning, inverse problems, and matrix-free PDE solvers, where adjoint evaluations may be unavailable or prohibitively expensive. We show that the lack of transpose access creates severe and sometimes insurmountable theoretical barriers. For Krylov methods, we prove that the sequence of projected operator norms produced by Arnoldi iteration can follow any prescribed nondecreasing curve, showing that forward matvecs alone provide essentially no reliable information about the spectral norm. For several core problems, including least squares, norm estimation, column subset selection, and local maximum volume, we establish non-identifiability results; distinct matrices can generate identical forward-query transcripts while having fundamentally different solutions. We also prove quantitative lower bounds on the number of forward matvecs required for approximation tasks. In particular, any algorithm that computes a near-optimal rank-$k$ approximation must use at least $n$ queries, and estimating the Frobenius norm to relative accuracy $\eps$ requires $Ω(\eps^{-2})$ queries when $n$ is sufficiently large, matching the complexity of Hutchinson-type estimators up to constants. Although some problems remain solvable without transpose access, the transpose-free setting is fundamentally more limited in both identifiability and efficiency.
LGFeb 24, 2023
Elliptic PDE learning is provably data-efficientNicolas Boullé, Diana Halikias, Alex Townsend
PDE learning is an emerging field that combines physics and machine learning to recover unknown physical systems from experimental data. While deep learning models traditionally require copious amounts of training data, recent PDE learning techniques achieve spectacular results with limited data availability. Still, these results are empirical. Our work provides theoretical guarantees on the number of input-output training pairs required in PDE learning. Specifically, we exploit randomized numerical linear algebra and PDE theory to derive a provably data-efficient algorithm that recovers solution operators of 3D uniformly elliptic PDEs from input-output data and achieves an exponential convergence rate of the error with respect to the size of the training dataset with an exceptionally high probability of success.
NAApr 2
Attention Mechanisms Through the Lens of Numerical Methods: Approximation Methods and Alternative FormulationsMichel Fabrice Serret, Alice Cortinovis, Yijun Dong et al.
The attention mechanism is the computational core of modern Transformer architectures, but its quadratic complexity in the input sequence length is the bottleneck for large-scale inference. This has motivated a rapidly growing body of work aimed at accelerating attention through approximation and reformulation. In this survey, we revisit attention mechanisms through the lens of numerical analysis, with a particular emphasis on tools and perspectives from numerical linear algebra. Our goal is twofold: first, we aim to systematically review and classify fast approximation methods according to the numerical principles they exploit. These include sparsity and clustering approaches, low-rank and subspace projection techniques, randomized sketching methods, and tensor-based decompositions. We also discuss kernel-inspired reformulations of attention and recent architectural variants, such as Latent Attention, that modify the standard softmax formulation to improve efficiency. Second, by presenting these developments within a unified mathematical framework, we aim to bridge the gap between disciplines and highlight opportunities for further contributions from computational mathematics, particularly numerical linear algebra, to the design of scalable attention mechanisms.
NAJan 31, 2024
Operator learning without the adjointNicolas Boullé, Diana Halikias, Samuel E. Otto et al.
There is a mystery at the heart of operator learning: how can one recover a non-self-adjoint operator from data without probing the adjoint? Current practical approaches suggest that one can accurately recover an operator while only using data generated by the forward action of the operator without access to the adjoint. However, naively, it seems essential to sample the action of the adjoint. In this paper, we partially explain this mystery by proving that without querying the adjoint, one can approximate a family of non-self-adjoint infinite-dimensional compact operators via projection onto a Fourier basis. We then apply the result to recovering Green's functions of elliptic partial differential operators and derive an adjoint-free sample complexity bound. While existing theory justifies low sample complexity in operator learning, ours is the first adjoint-free analysis that attempts to close the gap between theory and practice.
LGSep 23, 2021
Arbitrary-Depth Universal Approximation Theorems for Operator Neural NetworksAnnan Yu, Chloé Becquey, Diana Halikias et al.
The standard Universal Approximation Theorem for operator neural networks (NNs) holds for arbitrary width and bounded depth. Here, we prove that operator NNs of bounded width and arbitrary depth are universal approximators for continuous nonlinear operators. In our main result, we prove that for non-polynomial activation functions that are continuously differentiable at a point with a nonzero derivative, one can construct an operator NN of width five, whose inputs are real numbers with finite decimal representations, that is arbitrarily close to any given continuous nonlinear operator. We derive an analogous result for non-affine polynomial activation functions. We also show that depth has theoretical advantages by constructing operator ReLU NNs of depth $2k^3+8$ and constant width that cannot be well-approximated by any operator ReLU NN of depth $k$, unless its width is exponential in $k$.