CG, LG, ST · Apr 12, 2020

Measuring spatial uniformity with the hypersphere chord length distribution

arXiv:2004.05692v1 · 1 citation
AI Analysis

The paper provides a tool for detecting uniform pointsets in data science applications, though it appears incremental because it builds on existing chord length distribution theory.

The authors tackled the problem of measuring data uniformity in high-dimensional spaces by introducing a new measure based on the isomorphism between hyperspherical chords and L2-normalized Euclidean distances, validating it in four experimental setups.

Data uniformity is a concept associated with several semantic data characteristics, such as lack of features, correlation, and sample bias. This article introduces a novel measure to assess data uniformity and detect uniform pointsets in high-dimensional Euclidean spaces. The spatial uniformity measure builds upon the isomorphism between hyperspherical chords and the Euclidean distances of L2-normalised data, which is implied by the fact that, in Euclidean spaces, L2-normalised data can be geometrically defined as points on a hypersphere. The imposed connection between the distance distribution of uniformly selected points and the hyperspherical chord length distribution is employed to quantify uniformity. More specifically, the closed-form expression of the hypersphere chord length distribution is revisited and extended, before examining a few qualitative and quantitative characteristics of this distribution that can be rather straightforwardly linked to data uniformity. The experimental section includes validation in four distinct setups, thus substantiating the potential of the new uniformity measure in practical data-science applications.
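The link between L2-normalised distances and hypersphere chords can be illustrated with a minimal sketch. It is not the paper's uniformity measure, whose exact definition is not given here. It assumes the standard chord length density for two independent uniform points on the unit hypersphere in R^n, f(d) = d^(n-2) (1 - d^2/4)^((n-3)/2) / B((n-1)/2, 1/2) for 0 <= d <= 2, which implies a mean squared chord length of exactly 2. The function names (`l2_normalise`, `chord_lengths`, `chord_pdf`, `mean_squared_chord`) are hypothetical, chosen for illustration only.

```python
import math
import random
from itertools import combinations


def l2_normalise(point):
    """Project a non-zero point onto the unit hypersphere."""
    norm = math.sqrt(sum(c * c for c in point))
    return [c / norm for c in point]


def chord_lengths(points):
    """Pairwise Euclidean distances between the L2-normalised points.

    Each distance is the length of a chord of the unit hypersphere.
    """
    unit = [l2_normalise(p) for p in points]
    return [math.dist(u, v) for u, v in combinations(unit, 2)]


def chord_pdf(d, n):
    """Chord length density for uniform points on the unit sphere in R^n.

    Assumed valid for n >= 3 and 0 <= d <= 2.
    """
    log_beta = math.lgamma((n - 1) / 2) + math.lgamma(0.5) - math.lgamma(n / 2)
    return d ** (n - 2) * (1 - d * d / 4) ** ((n - 3) / 2) / math.exp(log_beta)


def mean_squared_chord(points):
    """Average squared chord length; 2 is the value expected under uniformity."""
    chords = chord_lengths(points)
    return sum(c * c for c in chords) / len(chords)


rng = random.Random(0)
n = 10
# Normalised isotropic Gaussian samples are uniform on the hypersphere.
uniform = [[rng.gauss(0, 1) for _ in range(n)] for _ in range(300)]
# Samples concentrated around one direction are far from uniform.
clustered = [[1.0 + 0.1 * rng.gauss(0, 1) for _ in range(n)] for _ in range(300)]

print(mean_squared_chord(uniform))    # close to 2
print(mean_squared_chord(clustered))  # well below 2
```

A single moment is only a crude indicator, since non-uniform pointsets can also have a mean squared chord near 2; comparing the full empirical distance histogram against `chord_pdf` would be a stricter check.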
