Interpretable, multi-dimensional Evaluation Framework for Causal Discovery from observational i.i.d. Data
This work addresses the need for a unified evaluation framework in causal discovery, an incremental but important contribution for researchers in machine learning and statistics.
The authors tackle the problem of evaluating causal discovery methods under violated identifiability assumptions by introducing a six-dimensional metric called distance to optimal solution (DOS), and find in a large-scale simulation study that, alongside causal order-based methods, amortized causal discovery yields results comparatively close to the optimal solution.
Nonlinear causal discovery from observational data imposes strict identifiability assumptions on the form of the structural equations used in the data-generating process. Evaluating structure learning methods under assumption violations requires a rigorous and interpretable approach that quantifies both the structural similarity of the estimated graph to the ground truth and the capacity of the discovered graphs to be used for causal inference. Motivated by the lack of a unified performance assessment framework, we introduce an interpretable, six-dimensional evaluation metric, the distance to optimal solution (DOS), which is specifically tailored to the field of causal discovery. Furthermore, this is the first study to assess the performance of structure learning algorithms from seven different families on an increasing percentage of non-identifiable, nonlinear causal patterns inspired by real-world processes. Our large-scale simulation study, which incorporates seven experimental factors, shows that, besides causal order-based methods, amortized causal discovery delivers results with comparatively high proximity to the optimal solution.
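The abstract does not give the formula for DOS, so the following is only an illustrative sketch of how a "distance to optimal solution" over six dimensions could be computed. It assumes that each dimension is a score normalized to [0, 1] with 1 as the ideal value, and that the distance is a normalized Euclidean distance to the ideal point. The function name `distance_to_optimum`, the normalization, and the example values are hypothetical and are not taken from the paper.

```python
import math


def distance_to_optimum(scores, optimum=None):
    """Normalized Euclidean distance between a score vector and an ideal point.

    Assumption (not from the paper): every score lies in [0, 1] and the
    ideal value of each dimension is 1 unless `optimum` says otherwise.
    Under that assumption the result lies in [0, 1]; 0 means the optimum
    is reached.
    """
    if optimum is None:
        optimum = [1.0] * len(scores)
    if len(scores) != len(optimum):
        raise ValueError("scores and optimum must have the same length")
    if not scores:
        raise ValueError("at least one dimension is required")
    squared_gap = sum((o - s) ** 2 for s, o in zip(scores, optimum))
    # Dividing by the number of dimensions keeps the value comparable
    # across metrics with different dimensionality.
    return math.sqrt(squared_gap / len(scores))


# Placeholder scores for six hypothetical evaluation dimensions.
example_scores = [0.9, 0.8, 0.75, 0.6, 0.95, 0.7]
print(f"distance to optimum: {distance_to_optimum(example_scores):.3f}")
```

Under these assumptions a smaller value indicates a graph estimate closer to the ideal on all six dimensions at once, and the per-dimension gaps remain available for interpretation.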