ML, LG | May 2, 2018

An Evaluation of Classification and Outlier Detection Algorithms

arXiv:1805.00811v12.710 citations

Originality: Synthesis-oriented

AI Analysis

This work provides practical heuristics for selecting algorithms in time-series applications, but it is incremental: it compares existing methods without introducing new ones.

The paper evaluates six fast classification and outlier detection algorithms on time-series datasets. Gradient Boosting Machines generally performed best for classification. No single algorithm dominated for outlier detection, although Gradient Boosting Machines and Random Forest were the stronger options.

This paper evaluates the classification and outlier detection accuracy of algorithms on temporal data. We focus on algorithms that train and classify rapidly and can be used in systems that need to incorporate new data regularly. Hence, we compare the accuracy of six fast algorithms using a range of well-known time-series datasets. The analyses demonstrate that the choice of algorithm is task- and data-specific, but that we can derive heuristics for choosing. Gradient Boosting Machines are generally best for classification, but there is no single winner for outlier detection, though Gradient Boosting Machines (again) and Random Forest are better. Hence, we recommend running evaluations of a number of algorithms using our heuristics.
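The recommendation to evaluate several candidate algorithms on one's own data can be illustrated with a small cross-validation harness. The following is only a minimal sketch under stated assumptions, not the paper's evaluation protocol: it uses the Python standard library, a synthetic two-class dataset, and two toy stand-in learners (`fit_majority` and `fit_nearest_centroid`, both hypothetical names introduced here). In practice the stand-ins would be replaced by the real candidates, such as a Gradient Boosting Machine and a Random Forest, and the synthetic data by features extracted from the time series.

```python
import random
from collections import Counter
from statistics import mean


def k_fold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and split them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cross_val_accuracy(fit, X, y, k=5, seed=0):
    """Mean held-out accuracy over k folds; fit(X, y) must return a predict function."""
    scores = []
    for held_out in k_fold_indices(len(X), k, seed):
        held = set(held_out)
        train = [i for i in range(len(X)) if i not in held]
        predict = fit([X[i] for i in train], [y[i] for i in train])
        correct = sum(predict(X[i]) == y[i] for i in held_out)
        scores.append(correct / len(held_out))
    return mean(scores)


# Toy stand-in learners (assumptions for illustration, NOT the paper's algorithms).
def fit_majority(X, y):
    """Baseline: always predict the most frequent training label."""
    label = Counter(y).most_common(1)[0][0]
    return lambda x: label


def fit_nearest_centroid(X, y):
    """Predict the class whose training mean is closest in squared Euclidean distance."""
    sums, counts = {}, Counter(y)
    for x, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(x))
        for j, v in enumerate(x):
            acc[j] += v
    centroids = {c: [s / counts[c] for s in sums[c]] for c in sums}

    def predict(x):
        return min(
            centroids,
            key=lambda c: sum((a - b) ** 2 for a, b in zip(x, centroids[c])),
        )

    return predict


def make_toy_data(n=200, seed=1):
    """Synthetic data: two well-separated Gaussian blobs, one per class."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        centre = 0.0 if label == 0 else 5.0
        X.append([rng.gauss(centre, 1.0), rng.gauss(centre, 1.0)])
        y.append(label)
    return X, y


X, y = make_toy_data()
candidates = {"majority": fit_majority, "nearest_centroid": fit_nearest_centroid}
# Score every candidate with the same folds so the comparison is like-for-like.
results = {name: cross_val_accuracy(fit, X, y) for name, fit in candidates.items()}
best = max(results, key=results.get)
print(results, "best:", best)
```

Using the same seed for every candidate keeps the folds identical across algorithms, so differences in the scores reflect the learners rather than the split. Note that a shuffled k-fold split is an assumption that suits this synthetic data; for genuinely temporal data a split that respects time order may be more appropriate.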
