AI · Sep 13, 2017

A Comparison of Public Causal Search Packages on Linear, Gaussian Data with No Latent Variables

arXiv:1709.04240v2 · 16 citations
AI Analysis

This study provides a benchmark for researchers and practitioners in causal inference to choose appropriate software tools for linear, Gaussian data analysis.

The authors compared four public causal search software packages (Tetrad, BNT, pcalg, bnlearn) on their ability to recover directed acyclic graph structure from linear, Gaussian data without latent variables. Accuracy and running time were evaluated across 27 combinations of variable count, sample size, and graph density, with statistics averaged over 10 runs per combination, for a total of 270 datasets.

We compare Tetrad (Java) algorithms to the other public software packages BNT (Bayes Net Toolbox, Matlab), pcalg (R), and bnlearn (R) on the "vanilla" task of recovering DAG structure to the extent possible from data generated recursively from linear, Gaussian structural equation models (SEMs) with no latent variables, for random graphs, with no additional knowledge of variable order or adjacency structure, and without additional specification of intervention information. Each of the above packages offers at least one implementation suitable for this purpose. We compare them on adjacency and orientation accuracy as well as time performance, for fixed datasets. We vary the number of variables, the number of samples, and the density of the graph, for a total of 27 combinations, averaging all statistics over 10 runs, for a total of 270 datasets. All runs are carried out on the same machine and on their native platforms. An interactive visualization tool is provided for the reader who wishes to know more than can be documented explicitly in this report.
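
To make the "vanilla" task concrete, the following is a minimal sketch of how data can be generated recursively from a linear, Gaussian SEM over a random DAG, and how adjacency recovery might be scored. It is an illustration under stated assumptions, not the paper's simulation code: the function names, the coefficient range, the unit noise variance, and the use of adjacency precision and recall as the accuracy measure are all hypothetical choices made here.

```python
import random

def random_dag(n_vars, avg_degree, rng):
    """Return a set of directed edges (i, j) with i < j, which guarantees acyclicity."""
    pairs = [(i, j) for i in range(n_vars) for j in range(i + 1, n_vars)]
    n_edges = min(len(pairs), round(avg_degree * n_vars / 2))
    return set(rng.sample(pairs, n_edges))

def simulate_linear_gaussian(n_vars, edges, n_samples, rng):
    """Generate rows recursively: each variable is a weighted sum of its
    parents plus independent Gaussian noise."""
    parents = {j: [] for j in range(n_vars)}
    weights = {}
    for i, j in sorted(edges):
        parents[j].append(i)
        # Assumed coefficient range, bounded away from zero (not from the paper).
        weights[(i, j)] = rng.choice([-1, 1]) * rng.uniform(0.5, 1.5)
    rows = []
    for _ in range(n_samples):
        x = [0.0] * n_vars
        for j in range(n_vars):  # index order is a topological order here
            x[j] = sum(weights[(i, j)] * x[i] for i in parents[j]) + rng.gauss(0.0, 1.0)
        rows.append(x)
    return rows

def adjacency_precision_recall(true_edges, est_edges):
    """Compare graphs as undirected adjacencies, ignoring orientation."""
    true_adj = {frozenset(e) for e in true_edges}
    est_adj = {frozenset(e) for e in est_edges}
    tp = len(true_adj & est_adj)
    precision = tp / len(est_adj) if est_adj else 1.0
    recall = tp / len(true_adj) if true_adj else 1.0
    return precision, recall

rng = random.Random(0)
edges = random_dag(n_vars=10, avg_degree=2, rng=rng)
data = simulate_linear_gaussian(10, edges, n_samples=100, rng=rng)
```

In a fair comparison the columns of `data` would be shuffled before being handed to a search algorithm, since the abstract specifies that no knowledge of variable order is given. The estimated edges from any of the packages could then be passed to `adjacency_precision_recall` alongside the true `edges`.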
