AI · Jul 18, 2024

Sortability of Time Series Data

arXiv:2407.13313v3 · 3 citations · h-index: 2
Originality: Synthesis-oriented
AI Analysis

This work addresses the challenge of evaluating causal discovery algorithms for time series data and reveals potential biases in performance metrics. Its contribution is incremental: it extends existing sortability concepts to a new domain.

The paper demonstrates that dataset characteristics such as varsortability and $R^2$-sortability, previously studied in non-time-series contexts, also occur in autocorrelated stationary time series data. The real-world datasets it examines show high varsortability and low $R^2$-sortability, suggesting that scales may carry causal information.

Evaluating the performance of causal discovery algorithms that aim to find causal relationships between time-dependent processes remains a challenging topic. In this paper, we show that certain characteristics of datasets, such as varsortability (Reisach et al. 2021) and $R^2$-sortability (Reisach et al. 2023), also occur in datasets for autocorrelated stationary time series. We illustrate this empirically using four types of data: simulated data based on SVAR models and Erdős-Rényi graphs, the data used in the 2019 causality-for-climate challenge (Runge et al. 2019), real-world river stream datasets, and real-world data generated by the Causal Chamber (Gamella et al. 2024). To do this, we adapt var- and $R^2$-sortability to time series data. We also investigate the extent to which the performance of score-based causal discovery methods goes hand in hand with high sortability. Arguably, our most surprising finding is that the investigated real-world datasets exhibit high varsortability and low $R^2$-sortability, indicating that scales may carry a significant amount of causal information.
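To make the notion of varsortability concrete, here is a minimal Python sketch. It is not the paper's method. It assumes a lag-1 VAR process with standard normal noise and uses a simplified, edge-wise score: the fraction of cross-variable edges whose effect has a larger marginal variance than its cause. Reisach et al. define varsortability over directed paths, and the paper's time series adaptation may differ. The function names `simulate_var1` and `edge_varsortability` and the coefficient matrix `A` are hypothetical, chosen only for illustration.

```python
import numpy as np


def simulate_var1(A, n_steps, rng, burn_in=200):
    """Simulate X_t = A @ X_{t-1} + e_t with standard normal noise e_t.

    Assumption: A has spectral radius < 1, so the process is stationary.
    The first `burn_in` steps are discarded.
    """
    d = A.shape[0]
    x = np.zeros(d)
    out = np.empty((n_steps, d))
    for t in range(burn_in + n_steps):
        x = A @ x + rng.standard_normal(d)
        if t >= burn_in:
            out[t - burn_in] = x
    return out


def edge_varsortability(data, A):
    """Edge-wise simplification of varsortability (illustrative only).

    An edge j -> i exists when A[i, j] != 0 and i != j. The score is the
    fraction of such edges with Var(X_i) > Var(X_j); ties count as 0.5.
    """
    var = data.var(axis=0)
    d = A.shape[0]
    score, n_edges = 0.0, 0
    for i in range(d):
        for j in range(d):
            if i != j and A[i, j] != 0:
                n_edges += 1
                if var[i] > var[j]:
                    score += 1.0
                elif var[i] == var[j]:
                    score += 0.5
    return score / n_edges if n_edges else float("nan")


# Hypothetical chain X0 -> X1 -> X2 with autoregressive coefficient 0.5
# on the diagonal and lagged cross-coefficient 0.8 along the chain.
A = np.array([
    [0.5, 0.0, 0.0],
    [0.8, 0.5, 0.0],
    [0.0, 0.8, 0.5],
])

rng = np.random.default_rng(0)
data = simulate_var1(A, n_steps=5000, rng=rng)
score = edge_varsortability(data, A)
print("marginal variances:", data.var(axis=0))
print("edge-wise varsortability:", score)
```

In this toy chain the variance accumulates along the causal order, so the score is high. Rescaling individual series would change the score without changing the causal structure, which is why a benchmark with high varsortability can favor methods that exploit scale.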

