ML | LG | Apr 16, 2021

Data Generating Process to Evaluate Causal Discovery Techniques for Time Series Data

arXiv:2104.08043v1 | 25 citations
AI Analysis

This work addresses the need for better benchmarks in causal discovery for time series. It benefits researchers and practitioners by enabling more robust method evaluation, though the contribution is incremental, as it builds on existing evaluation challenges.

The authors address the problem of evaluating causal discovery methods for time series data by proposing a flexible framework for generating synthetic data. Evaluating prominent methods with it revealed a notable degradation in performance when their assumptions were violated, and highlighted their sensitivity to the choice of hyperparameters.

Going beyond correlations, the understanding and identification of causal relationships in observational time series, an important subfield of Causal Discovery, poses a major challenge. The lack of access to a well-defined ground truth for real-world data creates the need to rely on synthetic data for the evaluation of these methods. Existing benchmarks are limited in their scope, as they either are restricted to a "static" selection of data sets, or do not allow for a granular assessment of the methods' performance when commonly made assumptions are violated. We propose a flexible and simple-to-use framework for generating time series data, which is aimed at developing, evaluating, and benchmarking time series causal discovery methods. In particular, the framework can be used to fine-tune novel methods on vast amounts of data, without "overfitting" them to a benchmark, but rather so they perform well in real-world use cases. Using our framework, we evaluate prominent time series causal discovery methods and demonstrate a notable degradation in performance when their assumptions are invalidated, as well as their sensitivity to the choice of hyperparameters. Finally, we propose future research directions and discuss how our framework can support both researchers and practitioners.
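
To make the idea of synthetic data with a known ground truth concrete, here is a minimal sketch of a toy data generating process. It is not the authors' framework: it assumes a linear vector autoregressive process with Gaussian noise, and the names `generate_linear_var` and `precision_recall`, along with all parameters and default values, are hypothetical choices made for illustration. The point it shows is that the generator returns both the time series and the lagged causal graph that produced it, so the output of any discovery method can be scored against that graph.

```python
import random


def generate_linear_var(n_vars=3, max_lag=2, n_steps=500, edge_prob=0.3,
                        coef_scale=0.5, noise_std=1.0, seed=0):
    """Toy linear data generating process with a known lagged causal graph.

    Returns (data, edges): data[t][j] is variable j at time t, and edges maps
    (source, target, lag) -> coefficient, meaning source at time t - lag
    linearly influences target at time t. Hypothetical illustration only.
    """
    rng = random.Random(seed)

    # Sample the ground-truth graph: each candidate lagged edge is included
    # independently with probability edge_prob.
    edges = {}
    for target in range(n_vars):
        for source in range(n_vars):
            for lag in range(1, max_lag + 1):
                if rng.random() < edge_prob:
                    edges[(source, target, lag)] = rng.uniform(-coef_scale, coef_scale)

    # Rescale each target's incoming coefficients so their absolute sum is at
    # most 0.9, a sufficient condition for the linear process to stay stable.
    for target in range(n_vars):
        incoming = [key for key in edges if key[1] == target]
        total = sum(abs(edges[key]) for key in incoming)
        if total > 0.9:
            for key in incoming:
                edges[key] *= 0.9 / total

    # Simulate: the first max_lag rows are pure noise, later rows add the
    # lagged linear effects of their parents to fresh Gaussian noise.
    data = [[rng.gauss(0.0, noise_std) for _ in range(n_vars)]
            for _ in range(max_lag)]
    for t in range(max_lag, n_steps):
        row = [rng.gauss(0.0, noise_std) for _ in range(n_vars)]
        for (source, target, lag), coef in edges.items():
            row[target] += coef * data[t - lag][source]
        data.append(row)
    return data, edges


def precision_recall(true_edges, predicted_edges):
    """Score a predicted set of (source, target, lag) edges against the truth."""
    true_set, pred_set = set(true_edges), set(predicted_edges)
    true_positives = len(true_set & pred_set)
    precision = true_positives / len(pred_set) if pred_set else 0.0
    recall = true_positives / len(true_set) if true_set else 0.0
    return precision, recall
```

A discovery method would be run on `data` and its predicted edges passed to `precision_recall` together with `edges`. Under the same assumptions, a violated-assumption scenario could be approximated by swapping the Gaussian noise for another distribution or the linear term for a nonlinear one, then comparing scores across settings.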

Code Implementations: 1 repo