ML
Apr 28, 2016

Optimal Transport vs. Fisher-Rao distance between Copulas for Clustering Multivariate Time Series

arXiv:1604.08634v2
16 citations
Originality: Synthesis-oriented
AI Analysis

This work addresses clustering challenges for multivariate time series data, particularly in finance, but appears incremental as it compares existing distance methods.

The paper tackles the problem of clustering multivariate time series by comparing distances between copulas, specifically Fisher-Rao and optimal transport distances, to leverage dependence structures, with applications demonstrated in financial asset clustering.

We present a methodology for clustering N objects that are described by multivariate time series, i.e. several sequences of real-valued random variables. This clustering methodology leverages copulas, which are distributions encoding the dependence structure between several random variables. To take the dependence information fully into account while clustering, we need a distance between copulas. In this work, we compare renowned distances between distributions (the Fisher-Rao geodesic distance, related divergences, and optimal transport) and discuss their advantages and disadvantages. Applications of this methodology can be found in the clustering of financial assets. A tutorial, experiments, and an implementation for reproducible research can be found at www.datagrapple.com/Tech.
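
To make the comparison concrete, here is a minimal sketch of the two kinds of distance named in the abstract. It is not the authors' implementation. It assumes Gaussian copulas, so that each copula is summarised by a correlation matrix and both distances have closed forms for centered Gaussians; that restriction, and the helper names `_sym_pow`, `fisher_rao_gaussian`, `wasserstein2_gaussian`, and `corr`, are assumptions of this example.

```python
import numpy as np


def _sym_pow(m, p):
    """Raise a symmetric positive-definite matrix to the power p (via eigh)."""
    w, v = np.linalg.eigh(m)
    return (v * w**p) @ v.T


def fisher_rao_gaussian(a, b):
    """Fisher-Rao geodesic distance between N(0, a) and N(0, b).

    Closed form: sqrt(0.5 * sum_i log(lambda_i)^2), where lambda_i are the
    eigenvalues of a^{-1/2} b a^{-1/2}.
    """
    a_inv_sqrt = _sym_pow(a, -0.5)
    lam = np.linalg.eigvalsh(a_inv_sqrt @ b @ a_inv_sqrt)
    return float(np.sqrt(0.5 * np.sum(np.log(lam) ** 2)))


def wasserstein2_gaussian(a, b):
    """2-Wasserstein (optimal transport) distance between N(0, a) and N(0, b).

    Closed form: W2^2 = tr(a) + tr(b) - 2 tr((a^{1/2} b a^{1/2})^{1/2}).
    """
    a_sqrt = _sym_pow(a, 0.5)
    cross = _sym_pow(a_sqrt @ b @ a_sqrt, 0.5)
    w2_sq = np.trace(a) + np.trace(b) - 2.0 * np.trace(cross)
    # Guard against tiny negative values caused by floating-point round-off.
    return float(np.sqrt(max(w2_sq, 0.0)))


def corr(rho):
    """2x2 correlation matrix with off-diagonal entry rho (hypothetical helper)."""
    return np.array([[1.0, rho], [rho, 1.0]])


if __name__ == "__main__":
    # Compare how each distance reacts as the second correlation approaches 1.
    for rho in (0.9, 0.99, 0.999):
        fr = fisher_rao_gaussian(corr(0.5), corr(rho))
        w2 = wasserstein2_gaussian(corr(0.5), corr(rho))
        print(f"rho={rho}: Fisher-Rao={fr:.3f}, Wasserstein-2={w2:.3f}")
```

Under these closed forms, the Fisher-Rao distance grows without bound as a correlation approaches 1, because an eigenvalue ratio goes to zero inside the logarithm, whereas the Wasserstein distance stays bounded. A pairwise matrix of either distance could then be passed to any clustering routine that accepts precomputed distances.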
