Nearest-Neighbor Radii under Dependent Sampling
For machine learning practitioners using nearest-neighbor methods on dependent data (e.g., time series), this work provides theoretical reassurance that geometric properties remain valid under dependence.
This paper proves that nearest-neighbor radii under dependent (mixing) sampling converge at the same rate as under independent sampling, with bounds depending on local intrinsic dimension rather than ambient dimension, supported by synthetic and real-world time-series experiments.
Nearest-neighbor methods are fundamental to classical and modern machine learning, yet their geometric properties are typically analyzed under independent sampling. In this paper, we study the nearest-neighbor radii under dependent sampling. We consider strong mixing dependent observations and ask whether dependence changes the scale of nearest-neighbor neighborhoods. We establish distribution-free almost sure convergence under polynomial mixing and sharp non-asymptotic moment bounds under geometric mixing. The moment bounds depend on the local intrinsic dimension rather than the ambient dimension, making the results applicable to high-dimensional data concentrated near lower-dimensional manifolds. Synthetic experiments and real-world time-series benchmarks support the theory, showing that nearest-neighbor geometry remains informative under dependence sampling.