ML | LG | NA | Mar 6, 2025

Determinant Estimation under Memory Constraints and Neural Scaling Laws

arXiv:2503.04424v2 | 3 citations | h-index: 13 | ICML
AI Analysis

The paper addresses a memory bottleneck in large-scale machine learning tasks such as kernel methods, offering a practical route to problems previously considered intractable. Its contribution is incremental: it combines existing techniques with new scaling insights.

The paper tackles the estimation of log-determinants of large matrices under memory constraints, particularly for neural tangent kernels. It derives a hierarchical algorithm and scaling laws that, in the authors' experiments, yield a ~100,000x speedup with improved accuracy while using only a tiny fraction of the full dataset.

Calculating or accurately estimating log-determinants of large positive definite matrices is of fundamental importance in many machine learning tasks. While its cubic computational complexity can already be prohibitive, in modern applications, even storing the matrices themselves can pose a memory bottleneck. To address this, we derive a novel hierarchical algorithm based on block-wise computation of the LDL decomposition for large-scale log-determinant calculation in memory-constrained settings. In extreme cases where matrices are highly ill-conditioned, accurately computing the full matrix itself may be infeasible. This is particularly relevant when considering kernel matrices at scale, including the empirical Neural Tangent Kernel (NTK) of neural networks trained on large datasets. Under the assumption of neural scaling laws in the test error, we show that the ratio of pseudo-determinants satisfies a power-law relationship, allowing us to derive corresponding scaling laws. This enables accurate estimation of NTK log-determinants from a tiny fraction of the full dataset; in our experiments, this results in a $\sim$100,000$\times$ speedup with improved accuracy over competing approximations. Using these techniques, we successfully estimate log-determinants for dense matrices of extreme sizes, which were previously deemed intractable and inaccessible due to their enormous scale and computational demands.
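The block-wise idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `block_logdet` and the parameter `block_size` are hypothetical, the whole matrix is held in memory for simplicity (a memory-constrained version would stream blocks from disk), and each pivot block is factored with Cholesky rather than LDL, which is valid only because the matrix is assumed positive definite. The sketch eliminates one diagonal block at a time, adds that block's log-determinant to a running total, and updates the remaining submatrix with the Schur complement.

```python
import numpy as np


def block_logdet(K, block_size):
    """Log-determinant of a symmetric positive definite matrix,
    accumulated block by block via Schur complements."""
    # Working copy. Assumption: the matrix fits in memory here; the point
    # of the sketch is the block recursion, not the out-of-core storage.
    A = np.array(K, dtype=np.float64)
    n = A.shape[0]
    total = 0.0
    for start in range(0, n, block_size):
        stop = min(start + block_size, n)
        # Current pivot block; after earlier updates it is a Schur complement.
        D = A[start:stop, start:stop]
        L = np.linalg.cholesky(D)  # D = L @ L.T
        total += 2.0 * np.sum(np.log(np.diag(L)))
        if stop < n:
            B = A[stop:, start:stop]  # blocks below the pivot
            # W = B @ inv(L).T, so that W @ W.T == B @ inv(D) @ B.T
            W = np.linalg.solve(L, B.T).T
            # Schur complement update of the trailing submatrix.
            A[stop:, stop:] -= W @ W.T
    return total


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.standard_normal((200, 50))
    K = X @ X.T + 200.0 * np.eye(200)  # synthetic SPD test matrix
    print(block_logdet(K, block_size=32))
    print(np.linalg.slogdet(K)[1])  # dense reference for comparison
```

On a well-conditioned test matrix the two printed values should agree to within floating-point error. For the highly ill-conditioned kernel matrices the abstract mentions, a plain Cholesky factorization of a pivot block may fail, which is one reason an LDL-based variant is attractive.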

Code Implementations: 1 repo
