LG, ML · Apr 20, 2020

A Benchmark Study on Time Series Clustering

arXiv:2004.09546v2 · 144 citations

AI Analysis

This paper provides a standardized reference for researchers in time series analysis, though its contribution is incremental, since it consolidates existing methods and data.

The authors tackled the lack of a comprehensive benchmark for time series clustering by creating the first one using all datasets from the UCR archive, evaluating eight methods across three algorithm categories and three distance measures, and reporting dataset-level metrics for future research.

This paper presents the first time series clustering benchmark utilizing all time series datasets currently available in the University of California Riverside (UCR) archive, the state-of-the-art repository of time series data. Specifically, the benchmark examines eight popular clustering methods representing three categories of clustering algorithms (partitional, hierarchical and density-based) and three types of distance measures (Euclidean, dynamic time warping, and shape-based). We lay out six restrictions with special attention to making the benchmark as unbiased as possible. A phased evaluation approach was then designed for summarizing dataset-level assessment metrics and discussing the results. The benchmark study presented can be a useful reference for the research community on its own, and the dataset-level assessment metrics reported may be used for designing evaluation frameworks to answer different research questions.
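To make the distinction between distance measures concrete, the following is a minimal sketch, not the authors' code, of two of the three distance types named above (Euclidean and dynamic time warping) plugged into a simple hierarchical clustering routine. The abstract does not name the eight methods, so single-linkage agglomerative clustering is used here purely as a stand-in for the hierarchical category. The function names (`euclidean`, `dtw`, `single_linkage`) and the toy series are illustrative assumptions, and the shape-based distance is omitted.

```python
import math


def euclidean(a, b):
    """Lock-step Euclidean distance; both series must have equal length."""
    if len(a) != len(b):
        raise ValueError("Euclidean distance needs equal-length series")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw(a, b):
    """Unconstrained dynamic time warping distance (squared pointwise cost)."""
    n, m = len(a), len(b)
    inf = float("inf")
    prev = [inf] * (m + 1)
    prev[0] = 0.0  # aligning two empty prefixes costs nothing
    for i in range(1, n + 1):
        curr = [inf] * (m + 1)
        for j in range(1, m + 1):
            cost = (a[i - 1] - b[j - 1]) ** 2
            # extend the cheapest of: insertion, deletion, match
            curr[j] = cost + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return math.sqrt(prev[m])


def single_linkage(series, k, dist):
    """Illustrative agglomerative clustering: merge closest clusters until k remain."""
    n = len(series)
    d = {(i, j): dist(series[i], series[j])
         for i in range(n) for j in range(i + 1, n)}
    clusters = [[i] for i in range(n)]
    while len(clusters) > k:
        best = None
        for p in range(len(clusters)):
            for q in range(p + 1, len(clusters)):
                gap = min(d[(min(i, j), max(i, j))]
                          for i in clusters[p] for j in clusters[q])
                if best is None or gap < best[0]:
                    best = (gap, p, q)
        _, p, q = best
        clusters[p] += clusters.pop(q)
    labels = [0] * n
    for c, members in enumerate(clusters):
        for i in members:
            labels[i] = c
    return labels


# Toy data (made up for illustration): two shapes, each with a time-shifted copy.
s0 = [0, 0, 1, 2, 1, 0, 0, 0]
s1 = [0, 0, 0, 1, 2, 1, 0, 0]  # s0 shifted right by one step
s2 = [3, 3, 2, 1, 2, 3, 3, 3]
s3 = [3, 3, 3, 2, 1, 2, 3, 3]  # s2 shifted right by one step

shift_euclidean = euclidean(s0, s1)  # 2.0: penalizes the shift
shift_dtw = dtw(s0, s1)              # 0.0: warping absorbs the shift
labels = single_linkage([s0, s1, s2, s3], k=2, dist=dtw)  # [0, 0, 1, 1]
```

In this toy case the shifted copy is at Euclidean distance 2.0 but DTW distance 0.0, which shows how the choice of distance measure alone can change what a clustering method treats as similar, and hence why a benchmark would vary it separately from the algorithm.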
