NA Feb 6, 2019
On maximum volume submatrices and cross approximation for symmetric semidefinite and diagonally dominant matrices
Alice Cortinovis, Daniel Kressner, Stefano Massei
The problem of finding a $k \times k$ submatrix of maximum volume of a matrix $A$ is of interest in a variety of applications. For example, it yields a quasi-best low-rank approximation constructed from the rows and columns of $A$. We show that such a submatrix can always be chosen to be a principal submatrix if $A$ is symmetric semidefinite or diagonally dominant. Then we analyze the low-rank approximation error returned by a greedy method for volume maximization, cross approximation with complete pivoting. Our bound for general matrices extends an existing result for symmetric semidefinite matrices and yields new error estimates for diagonally dominant matrices. In particular, for doubly diagonally dominant matrices the error is shown to remain within a modest factor of the best approximation error. We also illustrate how the application of our results to cross approximation for functions leads to new and better convergence results.
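For readers unfamiliar with the greedy method named in this abstract, the following is a minimal sketch of cross approximation with complete pivoting, assuming a small dense matrix stored as a list of rows. The function name `cross_approximation` and the choice to hold the full residual in memory are illustrative assumptions, not the authors' implementation.

```python
def cross_approximation(A, k):
    """Greedy cross approximation with complete pivoting (illustrative sketch).

    A is a list of rows (lists of floats). Returns the selected row indices,
    the selected column indices, and the residual after at most k steps.
    """
    R = [row[:] for row in A]  # residual, starts as a copy of A
    n_rows, n_cols = len(R), len(R[0])
    rows, cols = [], []
    for _ in range(k):
        # Complete pivoting: pick the residual entry of largest absolute value.
        i, j = max(
            ((p, q) for p in range(n_rows) for q in range(n_cols)),
            key=lambda pq: abs(R[pq[0]][pq[1]]),
        )
        pivot = R[i][j]
        if pivot == 0.0:
            break  # residual is exactly zero, nothing left to approximate
        rows.append(i)
        cols.append(j)
        pivot_col = [R[p][j] for p in range(n_rows)]
        pivot_row = R[i][:]
        # Rank-one update: R <- R - R[:, j] * R[i, :] / pivot
        for p in range(n_rows):
            factor = pivot_col[p] / pivot
            for q in range(n_cols):
                R[p][q] -= factor * pivot_row[q]
    return rows, cols, R


# Hypothetical symmetric positive definite example.
A = [[4.0, 1.0, 2.0],
     [1.0, 3.0, 0.0],
     [2.0, 0.0, 5.0]]
rows, cols, residual = cross_approximation(A, 2)
```

In this small example the selected row and column indices coincide, so the pivots sit on the diagonal and the chosen submatrix is principal. The abstract's result on principal submatrices concerns the maximum volume submatrix itself; the example only shows the greedy method selecting one in this instance.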
4.8 PR May 7
Computing the density of the Kesten-Stigum limit in supercritical Galton-Watson processes
Alice Cortinovis, Sophie Hautphenne, Stefano Massei
This paper proposes a novel numerical method for computing the density of the limit random variable associated with a supercritical Galton-Watson process. This random variable captures the effect of early demographic fluctuations and determines the random amplitude of long-term exponential population growth. While the existence of a non-trivial limit is ensured by the Kesten-Stigum theorem, computing its density in a stable and efficient manner for arbitrary offspring laws remains a significant challenge. The proposed approach leverages a functional equation that characterizes the Laplace-Stieltjes transform of the limit distribution and combines it with a moment-matching method to obtain accurate approximations within a class of linear combinations of Laguerre polynomials with exponential damping. The effectiveness of the approach is validated on several examples in which the offspring generating function is a polynomial of bounded degree.
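To make the limit random variable concrete, here is a minimal Monte Carlo sketch of the normalized population size $W_n = Z_n / m^n$, where $Z_n$ is the population at generation $n$ and $m$ is the offspring mean. This is plain simulation, not the transform-based method the abstract describes. The function name `simulate_w` and the offspring law on {0, 1, 2} are hypothetical choices for illustration.

```python
import random


def simulate_w(offspring_probs, generations, rng):
    """Simulate one path from Z_0 = 1 and return W_n = Z_n / m**n.

    offspring_probs[k] is the probability that an individual has k children.
    """
    values = list(range(len(offspring_probs)))
    mean = sum(k * p for k, p in zip(values, offspring_probs))
    z = 1
    for _ in range(generations):
        if z == 0:
            break  # extinction: the population stays at zero
        # Each of the z individuals reproduces independently.
        z = sum(rng.choices(values, weights=offspring_probs, k=z))
    return z / mean ** generations


# Hypothetical supercritical offspring law with mean m = 1.3.
rng = random.Random(0)
probs = [0.2, 0.3, 0.5]
samples = [simulate_w(probs, 10, rng) for _ in range(4000)]
sample_mean = sum(samples) / len(samples)
```

Since $W_n$ is a martingale with $E[W_n] = 1$, the sample mean should be close to 1. A histogram of `samples` gives only a rough picture of the limit law, including its atom at zero from extinct paths.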
82.6 NA Apr 2
Attention Mechanisms Through the Lens of Numerical Methods: Approximation Methods and Alternative Formulations
Michel Fabrice Serret, Alice Cortinovis, Yijun Dong et al.
The attention mechanism is the computational core of modern Transformer architectures, but its quadratic complexity in the input sequence length is the bottleneck for large-scale inference. This has motivated a rapidly growing body of work aimed at accelerating attention through approximation and reformulation. In this survey, we revisit attention mechanisms through the lens of numerical analysis, with a particular emphasis on tools and perspectives from numerical linear algebra. Our goal is twofold: first, we aim to systematically review and classify fast approximation methods according to the numerical principles they exploit. These include sparsity and clustering approaches, low-rank and subspace projection techniques, randomized sketching methods, and tensor-based decompositions. We also discuss kernel-inspired reformulations of attention and recent architectural variants, such as Latent Attention, that modify the standard softmax formulation to improve efficiency. Second, by presenting these developments within a unified mathematical framework, we aim to bridge the gap between disciplines and highlight opportunities for further contributions from computational mathematics, particularly numerical linear algebra, to the design of scalable attention mechanisms.
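The quadratic cost mentioned in this abstract can be seen in a minimal sketch of standard softmax attention, written here in plain Python with matrices as lists of rows. The function name `softmax_attention` is an illustrative assumption. For a sequence of length $n$, every query is scored against every key, so $n^2$ scores are computed.

```python
import math


def softmax_attention(Q, K, V):
    """Single-head softmax attention: softmax(Q K^T / sqrt(d)) V.

    Q, K, V are lists of rows; Q and K share the feature dimension d.
    """
    d = len(Q[0])
    n_out = len(V[0])
    output = []
    for q in Q:  # n queries ...
        # ... each scored against all n keys: n * n scores in total.
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in K]
        top = max(scores)  # subtract the max for numerical stability
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        output.append(
            [sum(w * v[c] for w, v in zip(weights, V)) / total
             for c in range(n_out)]
        )
    return output


# A zero query gives uniform weights, so the output is the mean of V's rows.
result = softmax_attention([[0.0, 0.0]],
                           [[1.0, 0.0], [0.0, 1.0]],
                           [[1.0, 2.0], [3.0, 4.0]])
```

The approximation families listed in the abstract (sparsity, low rank, sketching, tensor decompositions) can be read as ways to avoid forming all of these scores explicitly.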