LGMar 20, 2025

Line Space Clustering (LSC): Feature-Based Clustering using K-medians and Dynamic Time Warping for Versatility

arXiv:2503.15777v1h-index: 1Has Code
Originality Synthesis-oriented
AI Analysis

This addresses the challenge of clustering high-dimensional data for machine learning practitioners, but it appears incremental as it combines existing methods like K-medians and DTW.

The paper tackles clustering high-dimensional data by proposing Line Space Clustering (LSC), which transforms data into lines and uses a combined Euclidean and Dynamic Time Warping distance metric, showing efficacy on synthetic and real-world datasets in noisy environments.

Clustering high-dimensional data is a critical challenge in machine learning due to the curse of dimensionality and the presence of noise. Traditional clustering algorithms often fail to capture the intrinsic structures in such data. This paper explores a combination of clustering methods, which we called Line Space Clustering (LSC), a representation that transforms data points into lines in a newly defined feature space, enabling clustering based on the similarity of feature value patterns, essentially treating features as sequences. LSC employs a combined distance metric that uses Euclidean and Dynamic Time Warping (DTW) distances, weighted by a parameter α, allowing flexibility in emphasizing shape or magnitude similarities. We delve deeply into the mechanics of DTW and the Savitzky Golay filter, explaining their roles in the algorithm. Extensive experiments demonstrate the efficacy of LSC on synthetic and real-world datasets, showing that randomly experimenting with time-series optimized methods sometimes might surprisingly work on a complex dataset, particularly in noisy environments. Source code and experiments are available at: https://github.com/JoanikijChulev/LSC.

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes