The Landscape of Causal Discovery Data: Grounding Causal Discovery in Real-World Applications
This work addresses the limited real-world application of causal discovery by identifying gaps in evaluation practices. The contribution is incremental: it synthesizes existing knowledge rather than proposing new methods.
The paper systematically reviews the recent causal discovery literature, highlighting that current methods often rely on unrealistic assumptions and are evaluated on simple synthetic datasets. It also presents applications in biology, neuroscience, and Earth sciences to encourage better evaluation practices.
Causal discovery aims to automatically uncover causal relationships from data, a capability with significant potential across many scientific disciplines. However, its real-world applications remain limited. Current methods often rely on unrealistic assumptions and are evaluated only on simple synthetic toy datasets, frequently with inadequate evaluation metrics. In this paper, we substantiate these claims by performing a systematic review of the recent causal discovery literature. We present applications in biology, neuroscience, and Earth sciences, fields where causal discovery holds promise for addressing key challenges. We highlight available simulated and real-world datasets from these domains and discuss common assumption violations that have spurred the development of new methods. Our goal is to encourage the community to adopt better evaluation practices by utilizing realistic datasets and more adequate metrics.