Embracing the black box: Heading towards foundation models for causal discovery from time series data
This work addresses the gap in using end-to-end learning for causal discovery, potentially enabling foundation models for this domain, though it is incremental as it builds on existing deep learning approaches.
The paper tackles causal discovery from time series data by proposing Causal Pretraining, a supervised end-to-end learning method that maps multivariate time series directly to causal graphs. It finds that performance can improve with more data and larger models, even when the additional data have differing dynamics.
Causal discovery from time series data encompasses many existing solutions, including those based on deep learning techniques. However, these methods typically do not adopt one of the most prevalent paradigms in deep learning: end-to-end learning. To address this gap, we explore what we call Causal Pretraining, a methodology that aims to learn a direct mapping from multivariate time series to the underlying causal graphs in a supervised manner. Our empirical findings suggest that causal discovery in a supervised manner is possible, assuming that the training and test time series samples share most of their dynamics. More importantly, we found evidence that the performance of Causal Pretraining can increase with data and model size, even if the additional data do not share the same dynamics. Further, we provide examples where causal discovery for real-world data with causally pretrained neural networks is possible within limits. We argue that this hints at the possibility of a foundation model for causal discovery.
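To make the supervised setup concrete, the following is a minimal sketch of the general idea and not the method used in the paper. It assumes synthetic data from a linear lag-1 process, and it swaps the neural network for a one-feature logistic regression on lagged cross-correlations. All names (sample_graph, simulate, lag1_features, fit_edge_classifier, predict_graph) and all parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np

def sample_graph(n_vars, edge_prob, rng):
    """Random lag-1 graph: adj[i, j] = 1 means x_j(t-1) -> x_i(t)."""
    adj = (rng.random((n_vars, n_vars)) < edge_prob).astype(float)
    np.fill_diagonal(adj, 1.0)  # assumption: every variable depends on its own past
    return adj

def simulate(adj, n_steps, rng, coef=0.3):
    """Linear lag-1 process with unit Gaussian noise (assumed dynamics).
    Stable as long as coef * n_vars < 1, which holds for the values used here."""
    n_vars = adj.shape[0]
    x = np.zeros((n_steps, n_vars))
    for t in range(1, n_steps):
        x[t] = (coef * adj) @ x[t - 1] + rng.standard_normal(n_vars)
    return x

def lag1_features(x):
    """Entry [i, j] is the correlation between x_i(t) and x_j(t-1)."""
    z = (x - x.mean(axis=0)) / (x.std(axis=0) + 1e-8)
    return (z[1:].T @ z[:-1]) / (len(z) - 1)

def make_dataset(n_samples, n_vars, n_steps, rng):
    """Pairs of (per-edge feature, per-edge label) from many random graphs."""
    feats, labels = [], []
    for _ in range(n_samples):
        adj = sample_graph(n_vars, 0.3, rng)
        feats.append(lag1_features(simulate(adj, n_steps, rng)).ravel())
        labels.append(adj.ravel())
    return np.concatenate(feats), np.concatenate(labels)

def fit_edge_classifier(f, y, lr=1.0, n_iter=500):
    """Logistic regression with one weight and one bias, plain gradient descent."""
    w, b = 0.0, 0.0
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(w * f + b)))
        w -= lr * np.mean((p - y) * f)
        b -= lr * np.mean(p - y)
    return w, b

def predict_graph(x, w, b):
    """Map one multivariate time series to a matrix of edge probabilities."""
    return 1.0 / (1.0 + np.exp(-(w * lag1_features(x) + b)))

rng = np.random.default_rng(0)
f_train, y_train = make_dataset(200, 3, 300, rng)   # training graphs and series
w, b = fit_edge_classifier(f_train, y_train)

test_adj = sample_graph(3, 0.3, rng)                # unseen graph, same dynamics
probs = predict_graph(simulate(test_adj, 300, rng), w, b)
print(np.round(probs, 2))
print(test_adj)
```

The test graph here is drawn from the same generator as the training graphs, which mirrors the abstract's assumption that training and test samples share most of their dynamics. Whether such a mapping transfers to other dynamics is exactly the question the abstract raises, and this sketch makes no claim about it.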