LG | AI | Oct 15, 2022

GFlowCausal: Generative Flow Networks for Causal Discovery

arXiv:2210.08185v2 | 15 citations | h-index: 24
AI Analysis

This work addresses the challenge of scaling causal discovery to large datasets, which matters to researchers and practitioners, though it is an incremental improvement over existing score-based approaches.

The paper tackles the problem of causal discovery from observational data by proposing GFlowCausal, a method that converts graph search into a generation problem using generative flow networks, and it shows superior performance in experiments on synthetic and real datasets, including large-scale settings.

Causal discovery aims to uncover the causal structure among a set of variables. Score-based approaches mainly focus on searching for the best Directed Acyclic Graph (DAG) under a predefined score function. However, most of them do not scale to large problems because of their limited search ability. Inspired by active learning in generative flow networks, we propose GFlowCausal, a novel approach to learning a DAG from observational data. It converts the graph search problem into a generation problem in which directed edges are added gradually. GFlowCausal aims to learn the best policy for generating high-reward DAGs through sequential actions, with probabilities proportional to predefined rewards. We propose a plug-and-play module based on transitive closure to ensure efficient sampling. Theoretical analysis shows that this module guarantees acyclicity and the consistency between final states and fully-connected graphs. We conduct extensive experiments on both synthetic and real datasets, and the results show that the proposed approach outperforms the compared methods and also performs well in a large-scale setting.
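The abstract describes building a DAG one edge at a time while a transitive-closure module rules out actions that would create a cycle. The sketch below illustrates that general idea only; it is not the authors' implementation. The function names (`allowed_mask`, `add_edge`, `sample_dag`) are hypothetical, and a uniform random choice stands in for the learned GFlowNet policy, which the paper would instead train so that DAGs are sampled with probability proportional to their reward.

```python
import numpy as np

def allowed_mask(adj, closure):
    """Boolean mask of edges i -> j that can be added without creating a cycle.

    adj[i, j]     is True if the edge i -> j is already in the graph.
    closure[i, j] is True if some directed path leads from i to j.
    """
    d = adj.shape[0]
    # i -> j is forbidden if it already exists, if i == j,
    # or if j already reaches i (adding i -> j would close a cycle).
    forbidden = adj | np.eye(d, dtype=bool) | closure.T
    return ~forbidden

def add_edge(adj, closure, i, j):
    """Return updated copies of (adj, closure) after adding the edge i -> j."""
    adj, closure = adj.copy(), closure.copy()
    adj[i, j] = True
    ancestors = closure[:, i].copy()    # nodes that reach i ...
    ancestors[i] = True                 # ... plus i itself
    descendants = closure[j, :].copy()  # nodes reachable from j ...
    descendants[j] = True               # ... plus j itself
    # Every ancestor of i now reaches every descendant of j.
    closure[np.ix_(ancestors, descendants)] = True
    return adj, closure

def sample_dag(d, max_edges, rng):
    """Add edges one at a time; a uniform choice stands in for a learned policy."""
    adj = np.zeros((d, d), dtype=bool)
    closure = np.zeros((d, d), dtype=bool)
    for _ in range(max_edges):
        candidates = np.argwhere(allowed_mask(adj, closure))
        if len(candidates) == 0:  # no valid action left: terminal state
            break
        i, j = candidates[rng.integers(len(candidates))]
        adj, closure = add_edge(adj, closure, i, j)
    return adj, closure

adj, closure = sample_dag(d=5, max_edges=100, rng=np.random.default_rng(0))
print(adj.astype(int))
```

In this sketch, checking whether an edge is valid is a single matrix lookup rather than a graph traversal, which is one plausible reading of "efficient sampling". When no valid action remains, every pair of nodes is joined by exactly one edge, so the terminal graph on d nodes has d(d-1)/2 edges.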

Code Implementations: 2 repos
