LG · Feb 22, 2017

Ensembles of Randomized Time Series Shapelets Provide Improved Accuracy while Reducing Computational Costs

arXiv:1702.06712v1 · 3 citations
Originality: Incremental advance
AI Analysis

This work addresses a computational bottleneck for researchers and practitioners using shapelet-based time series classification, offering a more efficient solution with better performance.

The paper tackles the high computational cost of shapelet discovery in time series classification by proposing ensembles of classifiers based on randomly sampled shapelets, achieving improved accuracy and significantly reduced computational costs compared to the exact method.

Shapelets are discriminative time series subsequences that yield interpretable classification models, which classify faster and generally more accurately than the nearest-neighbor approach. However, shapelet discovery requires evaluating all possible subsequences of all time series in the training set, which makes it extremely computationally intensive. Consequently, shapelet discovery for large time series datasets quickly becomes intractable. A number of improvements have been proposed to reduce the training time. These techniques use approximation or discretization and often lead to reduced classification accuracy compared to the exact method. We propose using ensembles of shapelet-based classifiers obtained by randomly sampling the shapelet candidates. Random sampling reduces the number of evaluated candidates, and consequently the computational cost, while the classification accuracy of the resulting models is not significantly different from that of the exact algorithm. Combining the randomized classifiers rectifies the inaccuracies of individual models because of the diversity of the solutions. The experiments show that the proposed ensemble of inexpensive classifiers provides better classification accuracy than the exact method at a significantly lower computational cost.
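
The idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes equal-length series, uses a one-shapelet decision stump chosen by information gain as each ensemble member, and combines members by majority vote. All function names, parameters (`n_members`, `n_candidates`, `min_len`, `max_len`), and the toy data generator `make_series` are hypothetical and chosen only for illustration.

```python
import math
import random
from collections import Counter


def min_distance(shapelet, series):
    """Smallest Euclidean distance between the shapelet and any same-length window."""
    m = len(shapelet)
    best = float("inf")
    for start in range(len(series) - m + 1):
        d = sum((series[start + i] - shapelet[i]) ** 2 for i in range(m))
        best = min(best, d)
    return math.sqrt(best)


def entropy(labels):
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def best_split(distances, labels):
    """Return (information gain, threshold) for the best distance threshold."""
    pairs = sorted(zip(distances, labels))
    base = entropy(labels)
    best_gain, best_thr = -1.0, None
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold fits between equal distances
        left = [lab for _, lab in pairs[:i]]
        right = [lab for _, lab in pairs[i:]]
        rest = (len(left) * entropy(left) + len(right) * entropy(right)) / len(pairs)
        if base - rest > best_gain:
            best_gain, best_thr = base - rest, (pairs[i - 1][0] + pairs[i][0]) / 2
    return best_gain, best_thr


def train_stump(X, y, n_candidates, min_len, max_len, rng):
    """One inexpensive classifier: the best of a few randomly sampled candidates."""
    best = None
    for _ in range(n_candidates):
        series = rng.choice(X)
        length = rng.randint(min_len, min(max_len, len(series)))
        start = rng.randint(0, len(series) - length)
        shapelet = series[start:start + length]
        dists = [min_distance(shapelet, s) for s in X]
        gain, thr = best_split(dists, y)
        if thr is not None and (best is None or gain > best[0]):
            near = Counter(l for d, l in zip(dists, y) if d <= thr).most_common(1)[0][0]
            far = Counter(l for d, l in zip(dists, y) if d > thr).most_common(1)[0][0]
            best = (gain, shapelet, thr, near, far)
    return best


def train_ensemble(X, y, n_members=15, n_candidates=20, min_len=3, max_len=8, seed=0):
    """Train independent randomized stumps (assumed design, not the paper's exact one)."""
    rng = random.Random(seed)
    stumps = [train_stump(X, y, n_candidates, min_len, max_len, rng)
              for _ in range(n_members)]
    return [s for s in stumps if s is not None]


def predict(ensemble, series):
    """Majority vote over the ensemble members."""
    votes = Counter()
    for _, shapelet, thr, near, far in ensemble:
        votes[near if min_distance(shapelet, series) <= thr else far] += 1
    return votes.most_common(1)[0][0]


def make_series(rng, with_bump, length=30):
    """Toy data (hypothetical): small noise, optionally with a bump away from the edges."""
    s = [rng.uniform(-0.05, 0.05) for _ in range(length)]
    if with_bump:
        start = rng.randint(8, 16)
        for i, v in enumerate([1.0, 2.0, 3.0, 2.0, 1.0]):
            s[start + i] += v
    return s
```

Each member evaluates only `n_candidates` subsequences instead of every subsequence of every training series, which is where the cost saving in this sketch comes from. A member that happens to sample only uninformative candidates is weak on its own, and the vote across members is what compensates for it.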

Code Implementations: 1 repo