LG AI · Oct 16, 2025

From Guess2Graph: When and How Can Unreliable Experts Safely Boost Causal Discovery in Finite Samples?

Sujai Hiremath, Dominik Janzing, Philipp Faller, Patrick Blöbaum, Elke Kirschbaum, Shiva Prasad Kasiviswanathan, Kyra Gan

arXiv:2510.14488v17.11 citations · h-index: 27

Originality: Highly original

AI Analysis

This paper addresses the challenge of integrating unreliable expert knowledge into causal discovery. It offers researchers and practitioners a method that stays correct when the expert is wrong and performs better as expert accuracy increases.

The paper tackles the poor performance of causal discovery algorithms on limited samples. It proposes the Guess2Graph framework, which uses expert guesses to guide the sequence of statistical tests rather than replace them. The resulting methods, PC-Guess and gPC-Guess, preserve correctness regardless of expert error and improve monotonically with expert accuracy, with gPC-Guess achieving significantly stronger gains.

Causal discovery algorithms often perform poorly with limited samples. While integrating expert knowledge (including from LLMs) as constraints promises to improve performance, guarantees for existing methods require perfect predictions or uncertainty estimates, making them unreliable for practical use. We propose the Guess2Graph (G2G) framework, which uses expert guesses to guide the sequence of statistical tests rather than replacing them. This maintains statistical consistency while enabling performance improvements. We develop two instantiations of G2G: PC-Guess, which augments the PC algorithm, and gPC-Guess, a learning-augmented variant designed to better leverage high-quality expert input. Theoretically, both preserve correctness regardless of expert error, with gPC-Guess provably outperforming its non-augmented counterpart in finite samples when experts are "better than random." Empirically, both show monotonic improvement with expert accuracy, with gPC-Guess achieving significantly stronger gains.
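To make "guide the sequence of statistical tests rather than replacing them" concrete, the following is a minimal sketch of a PC-style skeleton search in which an expert guess only reorders the conditional-independence tests. It is not the paper's PC-Guess or gPC-Guess procedure: the function names, the oracle test, and the ordering rule (pairs the expert believes are non-adjacent are tested first) are all assumptions made for illustration.

```python
from itertools import combinations


def has_separating_set(x, y, adj, size, ci_test):
    """Try every conditioning set of the given size drawn from the neighbours of x or y."""
    seen = set()
    for a, b in ((x, y), (y, x)):
        for cond in combinations(sorted(adj[a] - {b}), size):
            if cond in seen:
                continue
            seen.add(cond)
            if ci_test(x, y, set(cond)):
                return True
    return False


def guided_skeleton(nodes, ci_test, guessed_edges, max_cond=2):
    """PC-style skeleton search; the guess changes test order, never test outcomes.

    ci_test(x, y, cond) returns True when x and y are judged independent given cond.
    guessed_edges is a set of frozensets the (possibly wrong) expert believes are edges.
    """
    adj = {v: set(nodes) - {v} for v in nodes}
    for size in range(max_cond + 1):
        pairs = [(x, y) for x, y in combinations(nodes, 2) if y in adj[x]]
        # Assumed ordering rule: pairs the expert thinks are NOT edges go first.
        pairs.sort(key=lambda p: frozenset(p) in guessed_edges)
        for x, y in pairs:
            if has_separating_set(x, y, adj, size, ci_test):
                adj[x].discard(y)
                adj[y].discard(x)
    return {frozenset((x, y)) for x in nodes for y in adj[x]}


def chain_oracle(x, y, cond):
    """Hypothetical oracle for the chain A -> B -> C: only A and C separate, given B."""
    return {x, y} == {"A", "C"} and "B" in cond


nodes = ["A", "B", "C"]
good_guess = {frozenset("AB"), frozenset("BC")}  # expert is right
bad_guess = {frozenset("AC")}                    # expert is wrong

skeleton_good = guided_skeleton(nodes, chain_oracle, good_guess)
skeleton_bad = guided_skeleton(nodes, chain_oracle, bad_guess)
```

In this toy setting both guesses lead to the same skeleton, because every edge removal still has to be justified by a test. With a finite-sample test in place of the oracle, the order of tests can change which errors occur, which is the regime the abstract's finite-sample claims concern.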

