Evaluating Causal Discovery Algorithms for Path-Specific Fairness and Utility in Healthcare
This work addresses the problem of deploying causal discovery for fairness and utility in clinical applications. Its contribution is incremental: it focuses on evaluation methods rather than new algorithms.
The paper tackles the challenge of evaluating causal discovery algorithms in healthcare when ground truth is unknown. The authors construct proxy ground-truth graphs in collaboration with experts and benchmark the Peter-Clark, Greedy Equivalence Search, and Fast Causal Inference algorithms on synthetic Alzheimer's disease data and heart failure clinical records data. Peter-Clark achieved the best structural recovery on the synthetic data, and Fast Causal Inference had the highest utility on the heart failure data. In the path-specific analysis, ejection fraction contributed 3.37 percentage points to the indirect effect in the ground truth.
Causal discovery in health data faces evaluation challenges when ground truth is unknown. We address this by collaborating with experts to construct proxy ground-truth graphs, establishing benchmarks for synthetic Alzheimer's disease data and heart failure clinical records data. We evaluate the Peter-Clark, Greedy Equivalence Search, and Fast Causal Inference algorithms on structural recovery and path-specific fairness decomposition, going beyond composite fairness scores. On synthetic data, Peter-Clark achieved the best structural recovery. On heart failure data, Fast Causal Inference achieved the highest utility. For path-specific effects, ejection fraction contributed 3.37 percentage points to the indirect effect in the ground truth. These differences drove variations in the fairness-utility ratio across algorithms. Our results highlight the need for graph-aware fairness evaluation and fine-grained path-specific analysis when deploying causal discovery in clinical applications.
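To make the idea of path-specific decomposition concrete, the following is a minimal sketch, not the method used in this work. It assumes a linear structural equation model, in which the effect carried by a directed path is the product of the edge coefficients along it. The node names (`A` for a sensitive attribute, `M1` and `M2` for mediators, `Y` for the outcome), the coefficients, and all function names are hypothetical and chosen only for illustration; they are not taken from the Alzheimer's disease or heart failure data.

```python
from typing import Dict, List, Tuple

Edge = Tuple[str, str]


def directed_paths(edges: Dict[Edge, float], source: str, target: str) -> List[List[str]]:
    """Enumerate every directed path from source to target by depth-first search."""
    children: Dict[str, List[str]] = {}
    for parent, child in edges:
        children.setdefault(parent, []).append(child)

    paths: List[List[str]] = []

    def walk(node: str, path: List[str]) -> None:
        if node == target:
            paths.append(path)
            return
        for child in children.get(node, []):
            if child not in path:  # skip nodes already visited, in case the input has a cycle
                walk(child, path + [child])

    walk(source, [source])
    return paths


def path_effect(edges: Dict[Edge, float], path: List[str]) -> float:
    """Linear-SEM assumption: a path's effect is the product of its edge coefficients."""
    effect = 1.0
    for parent, child in zip(path, path[1:]):
        effect *= edges[(parent, child)]
    return effect


def decompose(edges: Dict[Edge, float], source: str, target: str) -> Dict[str, float]:
    """Map each directed path (as a readable string) to its path-specific effect."""
    return {
        " -> ".join(path): path_effect(edges, path)
        for path in directed_paths(edges, source, target)
    }


# Hypothetical toy graph with made-up coefficients (illustration only).
toy_edges: Dict[Edge, float] = {
    ("A", "Y"): 0.10,    # direct path
    ("A", "M1"): 0.50,
    ("M1", "Y"): 0.20,
    ("A", "M2"): 0.30,
    ("M2", "Y"): -0.10,
    ("M1", "M2"): 0.40,
}

effects = decompose(toy_edges, "A", "Y")
direct = effects.get("A -> Y", 0.0)
indirect = sum(value for name, value in effects.items() if name != "A -> Y")

for name, value in effects.items():
    print(f"{name}: {value:+.3f}")
print(f"direct={direct:+.3f} indirect={indirect:+.3f} total={direct + indirect:+.3f}")
```

Running the same decomposition on a graph returned by a discovery algorithm and on the proxy ground-truth graph shows which individual paths differ. For example, a graph that omits the `M1 -> M2` edge loses one indirect path entirely, which a single composite fairness score would not reveal.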