LG ML Mar 21, 2018

Clustering to Reduce Spatial Data Set Size

arXiv:1803.08101v1, 33 citations
Originality: Synthesis-oriented
AI Analysis

This paper addresses data overload for researchers in spatial analysis, but its contribution is incremental: it applies an existing method to a new context.

The paper tackles the problem of excessive spatial data by using density-based clustering to compress spatially redundant points into representative features, achieving data size reduction.

Traditionally, the problem was that researchers lacked access to enough spatial data to answer pressing research questions or build compelling visualizations. Today, however, the problem is often that we have too much data. Spatially redundant or approximately redundant points may refer to a single feature (plus noise) rather than many distinct spatial features. We use a machine learning approach with density-based clustering to compress such spatial data into a set of representative features.
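The compression idea in the abstract can be illustrated with a small sketch. The abstract only says "density-based clustering", so the choice of DBSCAN here is an assumption, as are the names `dbscan`, `representatives`, `eps`, and `min_samples`, the use of planar Euclidean distance (latitude-longitude data would need a great-circle distance instead), and the rule of keeping the cluster member nearest the centroid as the representative. It is a minimal pure-Python illustration, not the paper's implementation.

```python
from math import hypot

NOISE = -1  # label for points that belong to no cluster


def dbscan(points, eps, min_samples):
    """Assumed DBSCAN variant: return a cluster id (0, 1, ...) or NOISE per point."""
    labels = [None] * len(points)

    def neighbors(i):
        # Indices of all points within eps of point i (including i itself).
        xi, yi = points[i]
        return [j for j, (xj, yj) in enumerate(points)
                if hypot(xi - xj, yi - yj) <= eps]

    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_samples:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins but does not expand
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nbrs = neighbors(j)
            if len(nbrs) >= min_samples:  # core point: keep expanding
                queue.extend(nbrs)
        cluster += 1
    return labels


def representatives(points, labels):
    """Keep one point per cluster: the member closest to the cluster centroid."""
    clusters = {}
    for p, lab in zip(points, labels):
        if lab != NOISE:
            clusters.setdefault(lab, []).append(p)
    reps = []
    for lab in sorted(clusters):
        members = clusters[lab]
        cx = sum(x for x, _ in members) / len(members)
        cy = sum(y for _, y in members) / len(members)
        reps.append(min(members, key=lambda p: hypot(p[0] - cx, p[1] - cy)))
    return reps


# Hypothetical data: two dense groups of near-duplicate points plus one isolated point.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
          (5.0, 5.0), (5.2, 5.0), (5.1, 5.0),
          (9.0, 0.0)]
labels = dbscan(points, eps=0.5, min_samples=2)
reduced = representatives(points, labels)
```

In this sketch, points labelled as noise are dropped from the reduced set. Setting `min_samples=1` would instead turn every isolated point into its own single-member cluster, so that no observation is discarded and only redundant points are merged.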

Code Implementations: 1 repo