LG, AI, May 25, 2022

Towards Symbolic Time Series Representation Improved by Kernel Density Estimators

arXiv:2205.12960v1, 1 citation, h-index: 11
Originality: Synthesis-oriented
AI Analysis

This work addresses a domain-specific problem for researchers and practitioners in data mining, offering an incremental improvement over existing symbolic time series representation methods.

The paper tackles a limitation of the SAX algorithm, which works reliably only for time series with Gaussian-like distributions. It proposes an improved method, edwSAX, that handles both Gaussian and non-Gaussian data, and reports promising improvements over SAX on time series reconstruction error and Euclidean distance lower bounding.

This paper deals with symbolic time series representation. It builds on the popular mapping technique Symbolic Aggregate approXimation (SAX), which is widely used in sequence classification, pattern mining, anomaly detection, time series indexing, and other data mining tasks. The disadvantage of this method, however, is that it works reliably only for time series with a Gaussian-like distribution. In our previous work we proposed an improvement of SAX, called dwSAX, which can deal with Gaussian as well as non-Gaussian data distributions. Recently we have made further progress with our solution, edwSAX. Our goal was to cover the information space optimally by means of sufficient alphabet utilization, and to satisfy the lower bounding criterion as tightly as possible. Here we describe our approach, including an evaluation on commonly employed tasks such as time series reconstruction error and Euclidean distance lower bounding, with promising improvements over SAX.
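
To make the Gaussian assumption concrete, here is a minimal sketch of classic SAX (z-normalization, piecewise aggregate approximation, then discretization with standard-normal quantiles as breakpoints), next to one possible density-driven alternative. It is not the authors' dwSAX or edwSAX, whose details the abstract does not give. The function names are hypothetical, and `kde_breakpoints` is only an assumed illustration of how a Gaussian kernel density estimate could supply equiprobable breakpoints for non-Gaussian data.

```python
from bisect import bisect_right
from statistics import NormalDist, mean, pstdev


def znormalize(series):
    """Rescale a series to zero mean and unit (population) standard deviation."""
    mu, sigma = mean(series), pstdev(series)
    return [(x - mu) / sigma for x in series]


def paa(series, n_segments):
    """Piecewise Aggregate Approximation: the mean of each equal-length segment.

    Assumes len(series) is divisible by n_segments.
    """
    seg = len(series) // n_segments
    return [mean(series[i * seg:(i + 1) * seg]) for i in range(n_segments)]


def gaussian_breakpoints(alphabet_size):
    """Classic SAX breakpoints: quantiles that split N(0, 1) into equiprobable bins."""
    std_normal = NormalDist()
    return [std_normal.inv_cdf(k / alphabet_size) for k in range(1, alphabet_size)]


def kde_breakpoints(samples, alphabet_size, bandwidth=None):
    """Hypothetical alternative: equiprobable breakpoints under a Gaussian KDE.

    Illustrative assumption only; this is not the edwSAX procedure.
    """
    n = len(samples)
    if bandwidth is None:
        # Normal-reference rule of thumb for the kernel bandwidth.
        bandwidth = 1.06 * pstdev(samples) * n ** (-1 / 5)
    kernel = NormalDist()

    def cdf(x):
        # CDF of the KDE: the average of the kernel CDFs centred on each sample.
        return sum(kernel.cdf((x - s) / bandwidth) for s in samples) / n

    lower = min(samples) - 10 * bandwidth
    upper = max(samples) + 10 * bandwidth
    breakpoints = []
    for k in range(1, alphabet_size):
        target = k / alphabet_size
        lo, hi = lower, upper
        for _ in range(60):  # bisection on the monotone CDF
            mid = (lo + hi) / 2
            if cdf(mid) < target:
                lo = mid
            else:
                hi = mid
        breakpoints.append((lo + hi) / 2)
    return breakpoints


def symbolize(values, breakpoints, alphabet="abcdefghijklmnopqrstuvwxyz"):
    """Map each value to the letter of the bin it falls into."""
    return "".join(alphabet[bisect_right(breakpoints, v)] for v in values)


if __name__ == "__main__":
    # A deliberately skewed (non-Gaussian) toy series.
    series = znormalize([(i / 16.0) ** 3 for i in range(64)])
    segments = paa(series, 8)
    print(symbolize(segments, gaussian_breakpoints(4)))
    print(symbolize(segments, kde_breakpoints(series, 4)))
```

On skewed data, Gaussian breakpoints tend to leave some letters rarely used, while breakpoints derived from the estimated density spread values more evenly across the alphabet. This is the kind of alphabet utilization the abstract refers to.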

