AI
Apr 23, 2021

Normalized multivariate time series causality analysis and causal graph reconstruction

arXiv:2104.11360v1
83 citations
Originality: Incremental advance
AI Analysis

This work addresses causality analysis for data science and machine learning, introducing a long-overdue generalization that is timely given community interest, though it appears incremental as it builds on prior theoretical advances.

The authors tackled the problem of causality analysis in multivariate time series by generalizing an information flow-based approach from bivariate to multivariate cases, resulting in a transparent and computationally efficient algorithm that successfully reconstructed causal graphs even in extreme noise and near-synchronization scenarios, accurately differentiating confounding processes.

Causality analysis is an important problem lying at the heart of science, and is of particular importance in data science and machine learning. An endeavor during the past 16 years viewing causality as a real physical notion so as to formulate it from first principles, however, seems to have gone unnoticed. This study introduces this line of work to the community, with a long-overdue generalization of the information flow-based bivariate time series causal inference to multivariate series, based on recent theoretical advances. The resulting formula is transparent and can be implemented as a computationally very efficient algorithm. It can be normalized and tested for statistical significance. Unlike previous work along this line, where only information flows are estimated, here an algorithm is also implemented to quantify the influence of a unit on itself. While this poses a challenge in some causal inferences, here it comes naturally, and hence self-loops in a causal graph are identified automatically as the causalities along edges are inferred. To demonstrate the power of the approach, two applications in extreme situations are presented. The first is a network of multivariate processes buried in heavy noise (with the noise-to-signal ratio exceeding 100), and the second a network of nearly synchronized chaotic oscillators. In both graphs, confounding processes exist. While reconstructing these causal graphs from the given series seems to be a huge challenge, a straightforward application of the algorithm immediately reveals the desired structure. In particular, the confounding processes are accurately differentiated. Considering the surge of interest in the community, this study is very timely.
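
The abstract states no formula, so the following is only a rough sketch of what an information flow-based estimate for multivariate series might look like. It assumes the estimator takes the form T(j→i) = (C_ij / C_ii) · [C⁻¹ C_d]_ji, where C is the sample covariance matrix of the series and C_d holds the covariances between each series and the Euler forward difference of every other series. That form, the function name `information_flow_matrix`, the `dt` parameter, and the toy two-variable system are all assumptions made for illustration and are not taken from this page. Normalization, significance testing, and the self-influence term mentioned in the abstract are not covered.

```python
import numpy as np


def information_flow_matrix(X, dt=1.0):
    """Sketch of a pairwise information-flow estimate (assumed form, see lead-in).

    X  : array of shape (n_steps, d), one column per time series.
    dt : sampling interval (hypothetical parameter).
    Returns T with T[j, i] = estimated flow from series j to series i.
    """
    X = np.asarray(X, dtype=float)
    dX = (X[1:] - X[:-1]) / dt            # Euler forward difference
    X0 = X[:-1]                           # align states with their differences
    X0c = X0 - X0.mean(axis=0)
    dXc = dX - dX.mean(axis=0)
    n = X0.shape[0]

    C = X0c.T @ X0c / (n - 1)             # C[k, l]  = cov(x_k, x_l)
    Cd = X0c.T @ dXc / (n - 1)            # Cd[k, i] = cov(x_k, dx_i/dt)

    A = np.linalg.solve(C, Cd)            # A[j, i] = sum_k inv(C)[j, k] * Cd[k, i]
    T = A * C / np.diag(C)[None, :]       # T[j, i] = A[j, i] * C[j, i] / C[i, i]
    np.fill_diagonal(T, np.nan)           # self-influence is not estimated in this sketch
    return T


# Toy system (assumption): series 1 drives series 0, but not the reverse.
rng = np.random.default_rng(0)
n = 20000
noise = rng.standard_normal((n, 2))
x = np.zeros((n, 2))
for t in range(n - 1):
    x[t + 1, 0] = 0.5 * x[t, 0] + 0.5 * x[t, 1] + noise[t, 0]
    x[t + 1, 1] = 0.6 * x[t, 1] + noise[t, 1]

T = information_flow_matrix(x)
print("flow 1 -> 0:", T[1, 0])
print("flow 0 -> 1:", T[0, 1])
```

In this toy setup the estimated flow from series 1 to series 0 should be clearly nonzero, while the reverse flow should be close to zero. Solving the linear system with `np.linalg.solve` rather than forming the inverse of C explicitly is a numerical convenience, not something the abstract prescribes.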

