LG | ME | May 9, 2023

Causal Discovery from Subsampled Time Series with Proxy Variables

arXiv:2305.05276v5 | 11 citations
Originality: Highly original
AI Analysis

This paper addresses a major barrier to causal discovery in scientific fields that work with time series data, offering a novel solution to the subsampling problem.

The paper tackles the problem of inferring causal structure from subsampled time series, where the measurement frequency is lower than the frequency of causal influence. It proposes a nonparametric, constraint-based algorithm that achieves full causal identifiability without parametric constraints.

Inferring causal structures from time series data is the central interest of many scientific inquiries. A major barrier to such inference is the problem of subsampling, i.e., the frequency of measurement is much lower than that of causal influence. To overcome this problem, numerous methods have been proposed, yet they were either limited to the linear case or failed to achieve identifiability. In this paper, we propose a constraint-based algorithm that can identify the entire causal structure from subsampled time series, without any parametric constraint. Our observation is that the challenge of subsampling arises mainly from hidden variables at the unobserved time steps. Meanwhile, every hidden variable has an observed proxy, which is essentially itself at some observable time in the future, benefiting from the temporal structure. Based on these observations, we can leverage the proxies to remove the bias induced by the hidden variables and hence achieve identifiability. Following this intuition, we propose a proxy-based causal discovery algorithm. Our algorithm is nonparametric and can achieve full causal identification. These theoretical advantages are reflected in synthetic and real-world experiments.
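
To make the subsampling problem concrete, the following is a minimal toy sketch, not the paper's algorithm. It assumes a linear chain X -> Y -> Z with one-step lags and a subsampling factor of 2; the function names (`simulate_chain`, `lagged_coefficients`), the coefficient values, and the use of least-squares regression are all illustrative assumptions chosen only to make the bias visible. At full resolution, X has no direct effect on Z. After subsampling, the intermediate value of Y is hidden, so X appears to influence Z directly even when the observed past of Y and Z is accounted for.

```python
import numpy as np

def simulate_chain(n_steps, c=0.8, seed=0):
    """Simulate a linear chain X -> Y -> Z with one-step lags (toy model)."""
    rng = np.random.default_rng(seed)
    data = np.zeros((n_steps, 3))          # columns: X, Y, Z
    noise = rng.standard_normal((n_steps, 3))
    for t in range(1, n_steps):
        x_prev, y_prev, z_prev = data[t - 1]
        data[t, 0] = 0.5 * x_prev + noise[t, 0]
        data[t, 1] = 0.5 * y_prev + c * x_prev + noise[t, 1]
        data[t, 2] = 0.5 * z_prev + c * y_prev + noise[t, 2]  # no X term
    return data

def lagged_coefficients(series):
    """Regress Z at each step on (X, Y, Z) at the previous observed step."""
    past, present = series[:-1], series[1:]
    coef, *_ = np.linalg.lstsq(past, present[:, 2], rcond=None)
    return coef                             # order: X, Y, Z

full = simulate_chain(20000)
sub = full[::2]                             # keep every second time step

x_to_z_full = lagged_coefficients(full)[0]  # near 0: no direct X -> Z edge
x_to_z_sub = lagged_coefficients(sub)[0]    # near c**2: path through hidden Y

print(f"X -> Z coefficient, full resolution: {x_to_z_full:.3f}")
print(f"X -> Z coefficient, subsampled:      {x_to_z_sub:.3f}")
```

In this toy model, substituting the hidden step gives Z_t = 0.25 Z_{t-2} + c Y_{t-2} + c^2 X_{t-2} + noise, so the subsampled regression recovers a coefficient near c^2 on X even though no direct edge exists. The abstract's proposal is to correct this kind of bias using observed proxies of the hidden variables, without the linearity assumed here.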

Code Implementations: 1 repo