LG ML | Jun 25, 2025

On the ability of Deep Neural Networks to Learn Granger Causality in Multi-Variate Time Series Data

arXiv:2506.20347v1 | 4.1 | h-index: 42

Originality: Highly original

AI Analysis

This work addresses the challenge of capturing complex associations in time series data, offering researchers and practitioners a more flexible alternative to linear models.

The paper tackles the problem of learning Granger causality in multivariate time series data by proposing a novel paradigm that treats it as a prediction task using deep neural networks, showing that a well-regularized model can learn the true causal structure without explicit variable selection terms.

Granger Causality (GC) offers an elegant statistical framework to study the associations within multivariate time series data. Linear Vector Autoregressive (VAR) models are easy to interpret, but their practical application is limited by the underlying assumptions on the kind of associations they can capture. Numerous attempts have already been made in the literature to exploit the function approximation power of Deep Neural Networks (DNNs) for the task of GC estimation. These methods, however, treat GC as a variable selection problem. We present a novel paradigm for approaching GC. We present the idea that GC is essentially linked with prediction, and that if a deep learning model is used to model the time series collectively or jointly, a well-regularized model may learn the true Granger causal structure from the data, given that there is enough training data. We propose to uncover the learned GC structure by comparing the model uncertainty, or the distribution of the residuals, when the past of all components is used against the case where a specific time series component is dropped from the model. We also compare the effect of input layer dropout on the ability of a neural network to learn Granger causality from the data. We show that a well-regularized model can in fact learn the true GC structure from the data without explicitly adding terms to the loss function that guide the model to select variables or perform sparse regression.
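
The residual comparison described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation. As a simplifying assumption, a one-lag linear model fitted by least squares stands in for the deep neural network, and "dropping" a component is approximated by zeroing its past values at evaluation time. The three-variable system and the names simulate, fit_joint, and gc_scores are hypothetical and chosen only for illustration.

```python
import numpy as np


def simulate(T=2000, seed=0):
    """Hypothetical toy system: x0 drives x1, while x2 is independent."""
    rng = np.random.default_rng(seed)
    x = np.zeros((T, 3))
    for t in range(1, T):
        e = rng.normal(scale=0.1, size=3)
        x[t, 0] = 0.5 * x[t - 1, 0] + e[0]
        x[t, 1] = 0.5 * x[t - 1, 1] + 0.8 * x[t - 1, 0] + e[1]
        x[t, 2] = 0.5 * x[t - 1, 2] + e[2]
    return x


def fit_joint(x):
    """Fit one joint model that predicts all components from the full past.

    Assumption: a linear least-squares map stands in for the DNN.
    """
    past, future = x[:-1], x[1:]
    W, *_ = np.linalg.lstsq(past, future, rcond=None)
    return W  # shape (d, d); past @ W approximates future


def gc_scores(x, W):
    """Ratio of residual variance with component j dropped vs. full past.

    scores[j, i] well above 1 suggests that the past of component j helps
    predict component i under the fitted model.
    """
    past, future = x[:-1], x[1:]
    full_var = np.var(future - past @ W, axis=0)
    d = x.shape[1]
    scores = np.zeros((d, d))
    for j in range(d):
        masked = past.copy()
        masked[:, j] = 0.0  # drop component j from the model input
        drop_var = np.var(future - masked @ W, axis=0)
        scores[j] = drop_var / full_var
    return scores


if __name__ == "__main__":
    x = simulate()
    W = fit_joint(x)
    print(np.round(gc_scores(x, W), 2))
```

In this toy setup, the entry for source x0 and target x1 should sit clearly above 1, while entries for unrelated pairs should stay close to 1. The diagonal reflects each component's dependence on its own past and is usually ignored when reading off the causal structure.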
