Reducing statistical time-series problems to binary classification
This work addresses the challenge of applying methods designed for i.i.d. data to problems involving dependent time series, and is aimed at researchers in statistics and machine learning.
The paper applies binary classification methods to time-series tasks such as clustering, homogeneity testing, and the three-sample problem. It introduces a new metric between time-series distributions, proves universal consistency of the resulting algorithms, and illustrates the results with experiments.
We show how binary classification methods developed to work on i.i.d. data can be used for solving statistical problems that are seemingly unrelated to classification and concern highly dependent time series. Specifically, the problems of time-series clustering, homogeneity testing, and the three-sample problem are addressed. The algorithms that we construct for solving these problems are based on a new metric between time-series distributions, which can be evaluated using binary classification methods. Universal consistency of the proposed algorithms is proven under the most general assumptions. The theoretical results are illustrated with experiments on synthetic and real-world data.
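To make the idea of a distance "evaluated using binary classification methods" concrete, here is a minimal sketch under assumptions that are not stated in the abstract. It cuts each series into sliding windows of length k, fits a classifier to tell the two sets of windows apart, and sums the resulting gaps in average classifier output over k. The nearest-centroid classifier, the weights 2^(-k), the cutoff `max_k`, and all function names are illustrative choices, not necessarily the construction used in the paper.

```python
def windows(seq, k):
    """All contiguous length-k windows of seq, as tuples."""
    return [tuple(seq[i:i + k]) for i in range(len(seq) - k + 1)]


def centroid(rows):
    """Coordinate-wise mean of a non-empty list of equal-length tuples."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def fit_nearest_centroid(a, b):
    """Fit a stand-in binary classifier (assumption: nearest centroid).

    Returns h(w) in {0, 1}: 1 if w is strictly closer to the centroid
    of b than to the centroid of a, else 0.
    """
    ca, cb = centroid(a), centroid(b)

    def h(w):
        da = sum((wi - ci) ** 2 for wi, ci in zip(w, ca))
        db = sum((wi - ci) ** 2 for wi, ci in zip(w, cb))
        return 1 if db < da else 0

    return h


def classifier_distance(x, y, max_k=3):
    """Hypothetical classification-based discrepancy between two series.

    For each window length k, measure how differently the fitted
    classifier behaves on windows of x versus windows of y, and
    combine the gaps with weights 2**(-k) (an assumed weighting).
    """
    total = 0.0
    for k in range(1, max_k + 1):
        wx, wy = windows(x, k), windows(y, k)
        if not wx or not wy:
            break  # series too short for this window length
        h = fit_nearest_centroid(wx, wy)
        mean_x = sum(h(w) for w in wx) / len(wx)
        mean_y = sum(h(w) for w in wy) / len(wy)
        total += 2.0 ** (-k) * abs(mean_x - mean_y)
    return total


x = [0.0] * 50
y = [10.0] * 50
d_same = classifier_distance(x, x)  # 0.0: identical series cannot be separated
d_diff = classifier_distance(x, y)  # 0.875 = 1/2 + 1/4 + 1/8: perfectly separable
```

In this sketch the classifier is fitted and evaluated on the same windows, so the value is an in-sample measure of separability. Any other binary classifier could be substituted for `fit_nearest_centroid` without changing the rest of the code.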