MLLGDGPRNov 17, 2025

Reconstruction of Manifold Distances from Noisy Observations

arXiv:2511.13025v11 citationsh-index: 2
Originality Incremental advance
AI Analysis

This addresses the challenge of robust distance estimation in manifold learning for data analysis, with incremental improvements over prior work by relaxing noise assumptions and handling missing observations.

The paper tackles the problem of reconstructing intrinsic manifold distances from noisy pairwise observations, achieving recovery up to an additive error of O(ε log ε⁻¹) under mild geometric assumptions, with sample complexity N ≍ ε⁻²ᵈ⁻² log(1/ε) and runtime o(N³).

We consider the problem of reconstructing the intrinsic geometry of a manifold from noisy pairwise distance observations. Specifically, let $M$ denote a diameter 1 d-dimensional manifold and $μ$ a probability measure on $M$ that is mutually absolutely continuous with the volume measure. Suppose $X_1,\dots,X_N$ are i.i.d. samples of $μ$ and we observe noisy-distance random variables $d'(X_j, X_k)$ that are related to the true geodesic distances $d(X_j,X_k)$. With mild assumptions on the distributions and independence of the noisy distances, we develop a new framework for recovering all distances between points in a sufficiently dense subsample of $M$. Our framework improves on previous work which assumed i.i.d. additive noise with known moments. Our method is based on a new way to estimate $L_2$-norms of certain expectation-functions $f_x(y)=\mathbb{E}d'(x,y)$ and use them to build robust clusters centered at points of our sample. Using a new geometric argument, we establish that, under mild geometric assumptions--bounded curvature and positive injectivity radius--these clusters allow one to recover the true distances between points in the sample up to an additive error of $O(\varepsilon \log \varepsilon^{-1})$. We develop two distinct algorithms for producing these clusters. The first achieves a sample complexity $N \asymp \varepsilon^{-2d-2}\log(1/\varepsilon)$ and runtime $o(N^3)$. The second introduces novel geometric ideas that warrant further investigation. In the presence of missing observations, we show that a quantitative lower bound on sampling probabilities suffices to modify the cluster construction in the first algorithm and extend all recovery guarantees. Our main technical result also elucidates which properties of a manifold are necessary for the distance recovery, which suggests further extension of our techniques to a broader class of metric probability spaces.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes