ML LG NC Sep 30, 2025

Estimating Dimensionality of Neural Representations from Finite Samples

Chanwoo Chun, Abdulkadir Canatar, SueYeon Chung, Daniel Lee

arXiv:2509.26560v1 1 citations h-index: 8

Originality: Incremental advance

AI Analysis

The paper addresses a methodological bottleneck for researchers who analyze neural data from finite samples, though the advance is incremental because it improves an existing measure rather than introducing a new one.

The paper tackles the problem that existing measures of global dimensionality in neural representations are biased with small sample sizes, and proposes a bias-corrected estimator that recovers true dimensionality in synthetic data and is invariant to sample size in applications like brain recordings and large language models.

The global dimensionality of a neural representation manifold provides rich insight into the computational process underlying both artificial and biological neural networks. However, all existing measures of global dimensionality are sensitive to the number of samples, i.e., the number of rows and columns of the sample matrix. We show that, in particular, the participation ratio of eigenvalues, a popular measure of global dimensionality, is highly biased with small sample sizes, and propose a bias-corrected estimator that is more accurate with finite samples and with noise. On synthetic data examples, we demonstrate that our estimator can recover the true known dimensionality. We apply our estimator to neural brain recordings, including calcium imaging, electrophysiological recordings, and fMRI data, and to the neural activations in a large language model and show our estimator is invariant to the sample size. Finally, our estimators can additionally be used to measure the local dimensionalities of curved neural manifolds by weighting the finite samples appropriately.
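To make the sample-size sensitivity concrete, here is a minimal sketch of the naive participation ratio, computed as the squared sum of the sample covariance eigenvalues divided by the sum of their squares. This is the standard uncorrected measure, not the bias-corrected estimator proposed in the paper, whose form is not given in this summary. The function name `participation_ratio`, the isotropic Gaussian test data, and the sample sizes are assumptions chosen for illustration only.

```python
import numpy as np

def participation_ratio(X):
    """Naive participation ratio of the sample covariance spectrum.

    X has shape (n_samples, n_features). Illustrative helper only;
    this is NOT the bias-corrected estimator from the paper.
    """
    Xc = X - X.mean(axis=0, keepdims=True)      # center each feature
    C = Xc.T @ Xc / (X.shape[0] - 1)            # sample covariance matrix
    eig = np.linalg.eigvalsh(C)                 # eigenvalues of a symmetric matrix
    return eig.sum() ** 2 / (eig ** 2).sum()    # (sum lambda)^2 / sum lambda^2

rng = np.random.default_rng(0)
true_dim = 50  # isotropic Gaussian: the population participation ratio equals 50

# Same distribution, two sample sizes (both values are arbitrary assumptions).
pr_small = participation_ratio(rng.standard_normal((20, true_dim)))
pr_large = participation_ratio(rng.standard_normal((5000, true_dim)))

print(f"20 samples:   PR = {pr_small:.1f}")
print(f"5000 samples: PR = {pr_large:.1f}")
```

With 20 samples the centered covariance has rank at most 19, so the naive estimate cannot exceed 19 even though the true dimensionality is 50. With many more samples the estimate moves toward 50, which is the kind of sample-size dependence the abstract describes.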

