GeoStat Representations of Time Series for Fast Classification
This work addresses the need for fast, efficient time series classification by offering a computationally simpler alternative to deep learning approaches. The contribution is incremental, since it generalizes existing trajectory classification techniques.
The paper tackles the computational cost of time series classification by introducing GeoStat representations, which summarize a time series through statistics of differential geometric quantities. Simple classifiers such as KNN and SVM trained on these representations achieve state-of-the-art results on many real datasets. On a challenging fishing vessel classification task, the method performs well relative to the state of the art while using only about 2% of the data used by prior methods.
Recent advances in time series classification have largely focused on methods that either employ deep learning or use other machine learning models for feature extraction. Though successful, these methods often pay for their power in computational complexity. In this paper, we introduce GeoStat representations for time series. GeoStat representations are based on a generalization of recent methods for trajectory classification, and summarize the information in a time series through comprehensive statistics of (possibly windowed) distributions of easy-to-compute differential geometric quantities, requiring no dynamic time warping. The features used are intuitive and require minimal parameter tuning. We perform an exhaustive evaluation of GeoStat on a number of real datasets, showing that simple KNN and SVM classifiers trained on these representations perform surprisingly well relative to modern single-model methods that require significant computational power, achieving state-of-the-art results in many cases. In particular, we show that this methodology performs well on a challenging dataset involving the classification of fishing vessels, where it achieves good performance relative to the state of the art despite having access to only approximately two percent of the dataset used to train and evaluate that state-of-the-art method.
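The general recipe described in the abstract (compute local geometric quantities, summarize their distributions per window, feed the resulting vector to a simple classifier) can be illustrated with a minimal sketch. The abstract does not list the exact quantities or statistics, so everything specific here is an assumption made for illustration: the choice of first differences, second differences, and turning angles as the geometric quantities, the five summary statistics, the equal-width windowing, and all function names. It is not the authors' implementation.

```python
import math
import statistics


def differences(values):
    """First differences of a sequence (a discrete 'velocity')."""
    return [b - a for a, b in zip(values, values[1:])]


def turning_angles(series):
    """Change in heading between consecutive segments of the curve (t, x_t).

    Assumes a unit time step, so each segment is the vector (1, dx).
    """
    headings = [math.atan2(dx, 1.0) for dx in differences(series)]
    return differences(headings)


def summarize(values):
    """Summary statistics of one distribution (an illustrative, assumed set)."""
    if not values:
        # Windows too short to yield this quantity contribute zeros.
        return [0.0] * 5
    return [
        min(values),
        max(values),
        statistics.fmean(values),
        statistics.pstdev(values),
        statistics.median(values),
    ]


def geostat_features(series, n_windows=1):
    """Concatenate per-window statistics of the assumed geometric quantities."""
    size = len(series) // n_windows
    features = []
    for w in range(n_windows):
        # The last window absorbs any leftover samples.
        end = len(series) if w == n_windows - 1 else (w + 1) * size
        window = series[w * size:end]
        velocity = differences(window)
        acceleration = differences(velocity)
        for quantity in (velocity, acceleration, turning_angles(window)):
            features.extend(summarize(quantity))
    return features


def nearest_neighbor(train, query_features):
    """1-NN with Euclidean distance; train is a list of (features, label)."""
    return min(train, key=lambda item: math.dist(item[0], query_features))[1]


# Toy usage: two labeled series and one noisy query.
flat = [0.0, 0.1, 0.0, 0.1, 0.0, 0.1, 0.0, 0.1]
ramp = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
train = [(geostat_features(flat), "flat"), (geostat_features(ramp), "ramp")]
query = [0.0, 0.9, 2.1, 3.0, 3.9, 5.1, 6.0, 7.0]
label = nearest_neighbor(train, geostat_features(query))
```

Because each series is reduced to a fixed-length vector regardless of its length, series of different lengths can be compared directly with a plain distance, which is why no alignment step such as dynamic time warping is needed in this sketch. In practice the 1-NN helper would likely be replaced by a library KNN or SVM classifier.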