LG · ML · Aug 29, 2024

Large-Scale Targeted Cause Discovery via Learning from Simulated Data

arXiv:2408.16218v3 · 5 citations · h-index: 5 · Has Code
Originality: Incremental advance
AI Analysis

This work addresses the computational challenge of causal discovery in large-scale systems such as gene regulatory networks, where identifying the causes of a target variable enables efficient regulation through intervention. The advance is incremental in that it builds on supervised learning with simulated data.

The paper tackles the problem of inferring the causal variables of a target variable from observations. It identifies a set of causal factors directly, without reconstructing the full graph, and outperforms existing methods on large-scale gene regulatory networks.

We propose a novel machine learning approach for inferring the causal variables of a target variable from observations. Our focus is on directly inferring a set of causal factors without requiring full causal graph reconstruction, which is computationally challenging in large-scale systems. The identified causal set consists of all potential regulators of the target variable under experimental settings, enabling efficient regulation through intervention. To achieve this, we train a neural network using supervised learning on simulated data to infer causality. By employing a subsampled-ensemble inference strategy, our approach scales linearly in the number of variables and handles systems with thousands of variables. Empirical results show that it outperforms existing methods that emphasize full-graph discovery at identifying causal relationships within large-scale gene regulatory networks. We validate our model's generalization capability across out-of-distribution graph structures and generating mechanisms, including gene regulatory networks of E. coli and the human K562 cell line. The implementation code is available at https://github.com/snu-mllab/Targeted-Cause-Discovery.
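
The subsampled-ensemble idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the `subset_size` and `passes` parameters, and the partition-per-pass scheme are all assumptions made for illustration, and the trained neural network is replaced by a simple absolute-correlation scorer as a stand-in. The point is only to show why the cost grows linearly: each scorer call sees a fixed-size subset of variables, and the number of calls is proportional to the number of variables.

```python
import numpy as np


def score_subset(x_sub, y):
    """Hypothetical stand-in for the trained network.

    Returns one score per column of x_sub: the absolute Pearson
    correlation between that column and the target y.
    """
    xc = x_sub - x_sub.mean(axis=0)
    yc = y - y.mean()
    denom = np.linalg.norm(xc, axis=0) * np.linalg.norm(yc) + 1e-12
    return np.abs(xc.T @ yc) / denom


def subsampled_ensemble_scores(x, y, subset_size=50, passes=5, seed=0):
    """Average per-variable scores over random fixed-size variable subsets.

    x: (n_samples, n_vars) observations of candidate variables.
    y: (n_samples,) observations of the target variable.
    Each pass shuffles the variables and splits them into chunks of at
    most `subset_size`, so every variable is scored exactly once per pass
    and the scorer is called passes * ceil(n_vars / subset_size) times.
    """
    rng = np.random.default_rng(seed)
    n_vars = x.shape[1]
    totals = np.zeros(n_vars)
    counts = np.zeros(n_vars)
    for _ in range(passes):
        perm = rng.permutation(n_vars)
        for start in range(0, n_vars, subset_size):
            idx = perm[start:start + subset_size]
            totals[idx] += score_subset(x[:, idx], y)
            counts[idx] += 1
    return totals / counts
```

With the correlation stand-in, a variable's score does not depend on which other variables share its subset, so averaging changes nothing here. A learned scorer that conditions on the whole subset would give different scores per subset, which is where the ensemble average becomes useful.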

Code Implementations: 1 repo
