LG SP · Feb 8, 2023

ASTRIDE: Adaptive Symbolization for Time Series Databases

Sylvain W. Combettes, Charles Truong, Laurent Oudre

arXiv:2302.04097v12.0 · h-index: 21 · Has Code

Originality: Incremental advance

AI Analysis

This work addresses the need for more effective symbolic representations in time series databases, offering incremental improvements over existing methods like SAX and SFA.

The authors address the symbolic representation of time series by introducing ASTRIDE and its accelerated variant FASTRIDE. ASTRIDE adaptively segments and quantizes signals using a dictionary of symbols shared across the whole data set. Both representations are evaluated on reconstruction and classification tasks across 86 data sets from the UCR archive.

We introduce ASTRIDE (Adaptive Symbolization for Time seRIes DatabasEs), a novel symbolic representation of time series, along with its accelerated variant FASTRIDE (Fast ASTRIDE). Unlike most symbolization procedures, ASTRIDE is adaptive during both the segmentation step by performing change-point detection and the quantization step by using quantiles. Instead of proceeding signal by signal, ASTRIDE builds a dictionary of symbols that is common to all signals in a data set. We also introduce D-GED (Dynamic General Edit Distance), a novel similarity measure on symbolic representations based on the general edit distance. We demonstrate the performance of the ASTRIDE and FASTRIDE representations compared to SAX (Symbolic Aggregate approXimation), 1d-SAX, SFA (Symbolic Fourier Approximation), and ABBA (Adaptive Brownian Bridge-based Aggregation) on reconstruction and, when applicable, on classification tasks. These algorithms are evaluated on 86 univariate equal-size data sets from the UCR Time Series Classification Archive. An open source GitHub repository called astride is made available to reproduce all the experiments in Python.
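The two adaptive steps described in the abstract (segmentation by change-point detection, then quantization by quantiles with one dictionary shared across the data set) can be illustrated with a minimal sketch. This is not the code from the astride repository: the function names `shared_breakpoints` and `symbolize`, the squared-deviation segment cost, the use of segment means as the quantized feature, and the toy signals are all assumptions made for illustration, and only the Python standard library is used.

```python
from bisect import bisect_right
from statistics import mean, quantiles


def shared_breakpoints(signals, n_segments):
    """Pick segment boundaries common to all equal-length signals.

    Assumption (not from the paper's text): a segment's cost is the squared
    deviation from its mean, summed over every signal, and the boundaries
    are chosen by exhaustive dynamic programming.
    """
    n = len(signals[0])
    # Prefix sums of values and squared values, one pair per signal.
    sums, squares = [], []
    for x in signals:
        s, q = [0.0], [0.0]
        for v in x:
            s.append(s[-1] + v)
            q.append(q[-1] + v * v)
        sums.append(s)
        squares.append(q)

    def cost(a, b):
        # Sum over signals of the squared error around the mean on [a, b).
        return sum(
            (q[b] - q[a]) - (s[b] - s[a]) ** 2 / (b - a)
            for s, q in zip(sums, squares)
        )

    inf = float("inf")
    # best[k][t]: minimal cost of splitting the first t samples into k segments.
    best = [[inf] * (n + 1) for _ in range(n_segments + 1)]
    prev = [[0] * (n + 1) for _ in range(n_segments + 1)]
    best[0][0] = 0.0
    for k in range(1, n_segments + 1):
        for t in range(k, n + 1):
            for s in range(k - 1, t):
                c = best[k - 1][s] + cost(s, t)
                if c < best[k][t]:
                    best[k][t], prev[k][t] = c, s

    # Backtrack from the end of the signal to recover the boundaries.
    bounds, t = [n], n
    for k in range(n_segments, 0, -1):
        t = prev[k][t]
        bounds.append(t)
    return bounds[::-1]


def symbolize(signals, n_segments, n_symbols):
    """Map each signal to a word over one alphabet shared by the data set."""
    bounds = shared_breakpoints(signals, n_segments)
    seg_means = [
        [mean(x[a:b]) for a, b in zip(bounds, bounds[1:])] for x in signals
    ]
    # Quantile bins are computed on the segment means pooled over all signals,
    # so every signal is encoded with the same dictionary of symbols.
    pooled = [m for row in seg_means for m in row]
    bins = quantiles(pooled, n=n_symbols)  # n_symbols - 1 cut points
    alphabet = "abcdefghijklmnopqrstuvwxyz"
    words = [
        "".join(alphabet[bisect_right(bins, m)] for m in row)
        for row in seg_means
    ]
    return bounds, bins, words


# Toy data set (hypothetical): two step signals with jumps at the same places.
signals = [
    [0, 0, 0, 0, 5, 5, 5, 5, 10, 10, 10, 10],
    [10, 10, 10, 10, 5, 5, 5, 5, 0, 0, 0, 0],
]
bounds, bins, words = symbolize(signals, n_segments=3, n_symbols=3)
print(bounds, words)
```

On this toy input the shared boundaries fall at the two jumps, and the two signals are encoded as `abc` and `cba` with the same three-symbol dictionary. The cubic-time dynamic program is only practical for short signals; it is used here because it keeps the sketch self-contained.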
