LG ML, Nov 27, 2019

A tale of two toolkits, report the second: bake off redux. Chapter 1. dictionary based classifiers

Anthony Bagnall, James Large, Matthew Middlehurst

arXiv:1911.12008v11.81 citations

Originality: Synthesis-oriented

AI Analysis

This work provides an incremental update to time series classification benchmarks, aiding researchers in selecting algorithms based on performance and resource trade-offs.

The study compares four dictionary-based time series classification algorithms as part of an update to a benchmark published in 2017. It finds that accuracy can be improved over the previous best at a higher cost in time and space, and that the previous best accuracy can be matched at far lower cost.

Time series classification (TSC) is the problem of learning labels from time dependent data. One class of algorithms is derived from a bag of words approach. A window is run along a series, the subseries is shortened and discretised to form a word, then features are formed from the histogram of frequency of occurrence of words. We call this type of approach to TSC dictionary based classification. We compare four dictionary based algorithms in the context of a wider project to update the great time series classification bakeoff, a comparative study published in 2017. We experimentally characterise the algorithms in terms of predictive performance, time complexity and space complexity. We find that we can improve on the previous best in terms of accuracy, but this comes at the cost of time and space. Alternatively, the same performance can be achieved with far less cost. We review the relative merits of the four algorithms before suggesting a path to possible improvement.
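The window, shorten, discretise and histogram steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not any of the four algorithms compared in the paper: it shortens each window by segment averaging (PAA) and discretises with fixed Gaussian breakpoints in the style of SAX, both of which are choices made here for simplicity. The function names, the alphabet size of 4 and the parameters `window_size` and `word_length` are hypothetical.

```python
from collections import Counter
from statistics import mean, pstdev

# Assumption: an alphabet of 4 letters, with breakpoints that split a
# standard normal distribution into 4 equal-probability bins (SAX-style).
BREAKPOINTS = (-0.6745, 0.0, 0.6745)
ALPHABET = "abcd"


def znormalise(window):
    """Rescale a window to zero mean and unit variance; flat windows map to zeros."""
    mu = mean(window)
    sigma = pstdev(window)
    if sigma == 0:
        return [0.0 for _ in window]
    return [(x - mu) / sigma for x in window]


def shorten(window, word_length):
    """Shorten a window to word_length values by averaging consecutive segments.

    Requires len(window) >= word_length so that no segment is empty.
    """
    n = len(window)
    return [
        mean(window[i * n // word_length:(i + 1) * n // word_length])
        for i in range(word_length)
    ]


def discretise(values):
    """Turn each value into a letter by counting the breakpoints it exceeds."""
    return "".join(ALPHABET[sum(v > b for b in BREAKPOINTS)] for v in values)


def bag_of_words(series, window_size, word_length):
    """Slide a window along the series and count the word each position yields."""
    counts = Counter()
    for start in range(len(series) - window_size + 1):
        window = znormalise(series[start:start + window_size])
        counts[discretise(shorten(window, word_length))] += 1
    return counts


# A steadily rising series yields the same "low then high" word in every window.
histogram = bag_of_words([0, 1, 2, 3, 4, 5, 6, 7], window_size=4, word_length=2)
```

The resulting `Counter` is the word histogram for one series. A classifier would compare or learn from these histograms across the training set; how that is done is where dictionary based algorithms differ, and it is not shown here.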
