A Topological Sorting Criterion for Random Causal Directed Acyclic Graphs
For researchers evaluating causal discovery algorithms on synthetic data, this work highlights a structural artifact that can inflate performance, suggesting the need for more realistic benchmarks.
The paper identifies a monotonic increase of relatives along the causal order in random DAGs used for evaluating causal discovery, and shows this can be exploited for causal order recovery via sorting by estimated relatives. The authors propose time-series DAGs as an alternative and discuss implications for algorithm evaluation.
Random directed acyclic graphs (DAGs) based on imposing an order on Erdős-Rényi and scale free random graphs are widely used for evaluating causal discovery algorithms. We show that in such DAGs, the set of nodes reachable via open paths, termed relatives, increases monotonically along the causal order. We assess the prevalence of this pattern numerically, and demonstrate that it can be exploited for causal order recovery via sorting by the estimated number of relatives. We note that many simulations in the literature feature settings where this yields an excellent proxy for the causal order, and show that a strict increase of relatives along the causal order leads to a singular Markov equivalence class. We propose sampling time-series DAGs as a possible alternative and discuss implications for causal discovery algorithms and their evaluation on synthetic data.