Causal Discovery from Sparse Time-Series Data Using Echo State Network
This work addresses reliable causal discovery in complex systems with missing or irregularly sampled data, with applications in fields such as fault diagnosis. The contribution is incremental, since it combines existing techniques.
The paper tackles causal discovery from sparse and non-uniformly sampled time-series data by proposing a two-part system that fills in missing data with Gaussian Process Regression and then uses an Echo State Network for causal discovery. On the Tennessee Eastman chemical dataset, the system outperforms three existing algorithms, as measured by the Matthews Correlation Coefficient (MCC) and Receiver Operating Characteristic (ROC) curves.
Causal discovery between collections of time-series data can help diagnose the causes of symptoms and, ideally, prevent faults before they occur. However, reliable causal discovery can be very challenging, especially when the data acquisition rate varies (i.e., non-uniform data sampling) or when data points are missing (e.g., sparse data sampling). To address these issues, we propose a new system composed of two parts: the first fills in missing data with Gaussian Process Regression, and the second leverages an Echo State Network, a type of reservoir computer (used, for example, for chaotic system modelling), for causal discovery. We evaluate the proposed system against three off-the-shelf causal discovery algorithms, namely structural expectation-maximization, sub-sampled linear auto-regression absolute coefficients, and multivariate Granger Causality with vector auto-regression, using the Tennessee Eastman chemical dataset. We report the corresponding Matthews Correlation Coefficient (MCC) and Receiver Operating Characteristic (ROC) curves and show that the proposed system outperforms the existing algorithms, demonstrating the viability of our approach for discovering causal relationships in a complex system with missing entries.
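The two-part pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. The abstract does not say how causal strength is read out of the Echo State Network, so the sketch assumes a Granger-style score: the held-out prediction error for a target variable is compared with and without the candidate cause among the reservoir inputs. The function names (gp_impute, esn_test_error, causal_score) and all hyperparameter values are hypothetical, and the Gaussian Process step uses a plain RBF kernel with fixed hyperparameters rather than a fitted one.

```python
import numpy as np


def gp_impute(t_obs, y_obs, t_grid, length_scale=1.0, noise=1e-2):
    """Posterior mean of a GP with an RBF kernel, evaluated on t_grid.

    length_scale and noise are illustrative values, not taken from the paper.
    """
    def rbf(a, b):
        d = a[:, None] - b[None, :]
        return np.exp(-0.5 * (d / length_scale) ** 2)

    mu = y_obs.mean()  # constant prior mean estimated from the observed samples
    K = rbf(t_obs, t_obs) + noise * np.eye(len(t_obs))
    alpha = np.linalg.solve(K, y_obs - mu)
    return mu + rbf(t_grid, t_obs) @ alpha


def esn_test_error(inputs, target, n_res=50, rho=0.9, ridge=1e-3,
                   washout=50, train_frac=0.7, seed=0):
    """Held-out one-step-ahead MSE of an ESN predicting target[t+1]."""
    rng = np.random.default_rng(seed)
    W_in = rng.uniform(-0.5, 0.5, (n_res, inputs.shape[1]))
    W = rng.uniform(-0.5, 0.5, (n_res, n_res))
    W *= rho / np.max(np.abs(np.linalg.eigvals(W)))  # rescale to spectral radius rho

    state = np.zeros(n_res)
    states = []
    for u in inputs[:-1]:  # the reservoir is fixed and random; only the readout is trained
        state = np.tanh(W_in @ u + W @ state)
        states.append(state)
    # Append a bias column and discard the initial transient (washout).
    X = np.column_stack([np.array(states), np.ones(len(states))])[washout:]
    y = target[1:][washout:]

    split = int(train_frac * len(X))
    A = X[:split].T @ X[:split] + ridge * np.eye(X.shape[1])
    W_out = np.linalg.solve(A, X[:split].T @ y[:split])  # ridge-regression readout
    return float(np.mean((X[split:] @ W_out - y[split:]) ** 2))


def causal_score(series, cause, effect, **esn_kwargs):
    """Assumed Granger-style ratio: error without the candidate cause / error with it."""
    restricted = np.delete(series, cause, axis=1)
    err_without = esn_test_error(restricted, series[:, effect], **esn_kwargs)
    err_with = esn_test_error(series, series[:, effect], **esn_kwargs)
    return err_without / err_with
```

In this sketch, each irregularly sampled variable would first be resampled onto a common uniform grid with gp_impute, the results stacked column-wise into an array of shape (time steps, variables), and causal_score called for each ordered pair of columns. A ratio well above 1 suggests that the candidate cause helps predict the effect; thresholding those ratios would give a predicted causal graph that could then be compared with a reference graph using a metric such as MCC.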