BUILD with Precision: Bottom-Up Inference of Linear DAGs
This work addresses causal discovery and DAG learning, which is important for fields like statistics and machine learning, but it is incremental as it builds on existing identifiability conditions and algorithms.
The authors tackled the problem of learning directed acyclic graph (DAG) structures from observational data under a linear Gaussian model with equal noise variances, and proposed BUILD, a deterministic algorithm that exactly reconstructs the DAG from the true precision matrix, showing favorable performance compared to state-of-the-art methods in synthetic benchmarks.
Learning the structure of directed acyclic graphs (DAGs) from observational data is a central problem in causal discovery, statistical signal processing, and machine learning. Under a linear Gaussian structural equation model (SEM) with equal noise variances, the problem is identifiable and we show that the ensemble precision matrix of the observations exhibits a distinctive structure that facilitates DAG recovery. Exploiting this property, we propose BUILD (Bottom-Up Inference of Linear DAGs), a deterministic stepwise algorithm that identifies leaf nodes and their parents, then prunes the leaves by removing incident edges to proceed to the next step, exactly reconstructing the DAG from the true precision matrix. In practice, precision matrices must be estimated from finite data, and ill-conditioning may lead to error accumulation across BUILD steps. As a mitigation strategy, we periodically re-estimate the precision matrix (with less variables as leaves are pruned), trading off runtime for enhanced robustness. Reproducible results on challenging synthetic benchmarks demonstrate that BUILD compares favorably to state-of-the-art DAG learning algorithms, while offering an explicit handle on complexity.