Extreme-SAX: Extreme Points Based Symbolic Representation for Time Series Classification
This work addresses a domain-specific issue in data mining for time series analysis, offering an incremental improvement over existing dimensionality reduction techniques.
The paper tackles time series classification by proposing Extreme-SAX (E-SAX), a symbolic representation method that represents each segment by its extreme points. Experiments on various datasets show that E-SAX improves classification accuracy over the original SAX method.
Time series classification is an important problem in data mining with applications in many domains. Because time series data are usually high dimensional, dimensionality reduction techniques have been proposed as an efficient way to handle them. One of the most popular dimensionality reduction techniques for time series data is the Symbolic Aggregate Approximation (SAX), which is inspired by algorithms from text mining and bioinformatics. SAX is simple and efficient because it uses precomputed distances. The disadvantage of SAX is its inability to accurately represent important points in the time series. In this paper we present Extreme-SAX (E-SAX), which uses only the extreme points of each segment to represent the time series. E-SAX has exactly the same simplicity and efficiency as the original SAX, yet it gives better results in time series classification, as we show in extensive experiments on a variety of time series datasets.
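To make the contrast concrete, the following Python sketch places a conventional SAX word next to an extreme-point variant. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not say how the extreme points are mapped to a symbol, so the sketch assumes each segment is summarised by the midpoint of its minimum and maximum. The function names (`sax_word`, `extreme_word`, `mindist`, and the helpers) are hypothetical, the series length is assumed to be divisible by the number of segments, and the lookup table follows the usual SAX lower-bounding distance over Gaussian breakpoints.

```python
import math
from bisect import bisect_right
from statistics import NormalDist, mean, pstdev

def znormalize(series):
    """Rescale to zero mean and unit variance, as SAX conventionally assumes."""
    mu, sigma = mean(series), pstdev(series)
    if sigma == 0:
        return [0.0 for _ in series]
    return [(x - mu) / sigma for x in series]

def breakpoints(alphabet_size):
    """Cut points that split the standard normal into equiprobable regions."""
    nd = NormalDist()
    return [nd.inv_cdf(i / alphabet_size) for i in range(1, alphabet_size)]

def split(series, n_segments):
    """Equal-width segments; assumes len(series) is divisible by n_segments."""
    width = len(series) // n_segments
    return [series[i * width:(i + 1) * width] for i in range(n_segments)]

def to_symbols(values, alphabet_size):
    """Map each value to a letter according to the region it falls in."""
    cuts = breakpoints(alphabet_size)
    return "".join(chr(ord("a") + bisect_right(cuts, v)) for v in values)

def sax_word(series, n_segments, alphabet_size):
    """Conventional SAX: one symbol per segment, from the segment mean."""
    z = znormalize(series)
    return to_symbols([mean(s) for s in split(z, n_segments)], alphabet_size)

def extreme_word(series, n_segments, alphabet_size):
    """Extreme-point variant. ASSUMPTION: each segment is summarised by the
    midpoint of its minimum and maximum; the abstract does not specify this."""
    z = znormalize(series)
    mids = [(min(s) + max(s)) / 2 for s in split(z, n_segments)]
    return to_symbols(mids, alphabet_size)

def distance_table(alphabet_size):
    """Precomputed symbol-to-symbol distances (standard SAX lookup table)."""
    cuts = breakpoints(alphabet_size)
    table = [[0.0] * alphabet_size for _ in range(alphabet_size)]
    for r in range(alphabet_size):
        for c in range(alphabet_size):
            if abs(r - c) > 1:  # identical or adjacent symbols have distance 0
                table[r][c] = cuts[max(r, c) - 1] - cuts[min(r, c)]
    return table

def mindist(word_a, word_b, series_length, alphabet_size):
    """Distance between two equal-length words using only table lookups."""
    table = distance_table(alphabet_size)
    total = sum(table[ord(a) - 97][ord(b) - 97] ** 2
                for a, b in zip(word_a, word_b))
    return math.sqrt(series_length / len(word_a)) * math.sqrt(total)

# A single spike in the first segment: the mean dampens it, the extremes do not.
spike = [0, 0, 0, 4, 0, 0, 0, 0]
print(sax_word(spike, 2, 4))      # cb
print(extreme_word(spike, 2, 4))  # db
```

In this toy series the spike pulls the first segment's mean only into region `c`, while the min/max midpoint lands in region `d`. Both words are built from the same breakpoints and compared with the same lookup table, which is one way to read the claim that the extreme-point representation keeps the simplicity and efficiency of SAX.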