AI | Apr 6

IV Co-Scientist: Multi-Agent LLM Framework for Causal Instrumental Variable Discovery

Ivaxi Sheth, Zhijing Jin, Bryan Wilder, Dominik Janzing, Mario Fritz

arXiv:2602.07943 | 59.71 citations | h-index: 7

AI Analysis

The paper addresses the challenge of causal inference for researchers in fields such as economics and the social sciences, but its contribution is incremental, since it builds on existing LLM capabilities for one specific task.

The paper tackles the problem of identifying valid instrumental variables for causal inference in the presence of confounding. It introduces IV Co-Scientist, a multi-agent LLM framework that proposes, critiques, and refines IVs, and its results show the potential of LLMs for discovering valid instruments from observational data.

In the presence of confounding between an endogenous variable and the outcome, instrumental variables (IVs) are used to isolate the causal effect of the endogenous variable. Identifying valid instruments requires interdisciplinary knowledge, creativity, and contextual understanding, making it a non-trivial task. In this paper, we investigate whether large language models (LLMs) can aid in this task. We conduct a two-stage evaluation. First, we test whether LLMs can recover well-established instruments from the literature, assessing their ability to replicate standard reasoning. Second, we evaluate whether LLMs can identify and avoid instruments that have been empirically or theoretically discredited. Building on these results, we introduce IV Co-Scientist, a multi-agent system that proposes, critiques, and refines IVs for a given treatment-outcome pair. We also introduce a statistical test to contextualize consistency in the absence of ground truth. Our results show the potential of LLMs to discover valid instrumental variables from a large observational database.
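
As background for the abstract's first sentence, the sketch below illustrates how an instrument can isolate a causal effect when an unobserved confounder biases ordinary regression. It is not taken from the paper: the simulated data, the coefficients, and the function names (simulate, ols_slope, two_stage_least_squares) are assumptions chosen for illustration, and the estimator shown is textbook two-stage least squares with a single instrument.

```python
import numpy as np


def simulate(n=20000, true_effect=2.0, seed=0):
    """Simulate a confounded treatment-outcome pair with one valid instrument.

    All coefficients here are illustrative assumptions, not values from the paper.
    """
    rng = np.random.default_rng(seed)
    u = rng.normal(size=n)  # unobserved confounder
    z = rng.normal(size=n)  # instrument: drives x, independent of u
    x = 1.0 * z + 1.0 * u + rng.normal(size=n)           # endogenous variable
    y = true_effect * x + 2.0 * u + rng.normal(size=n)   # outcome
    return z, x, y


def ols_slope(x, y):
    """Slope of a simple least-squares regression of y on x (with intercept)."""
    x_centered = x - x.mean()
    return float(x_centered @ (y - y.mean()) / (x_centered @ x_centered))


def two_stage_least_squares(z, x, y):
    """Single-instrument 2SLS estimate of the effect of x on y."""
    # Stage 1: predict the endogenous variable from the instrument.
    first_stage = ols_slope(z, x)
    x_hat = x.mean() + first_stage * (z - z.mean())
    # Stage 2: regress the outcome on the predicted values only.
    return ols_slope(x_hat, y)


if __name__ == "__main__":
    z, x, y = simulate()
    print("naive OLS estimate:", ols_slope(x, y))
    print("2SLS estimate:     ", two_stage_least_squares(z, x, y))
```

Under these assumed coefficients, the naive OLS slope is biased upward (its large-sample value is about 2.67), while the 2SLS estimate should land near the true effect of 2.0. The estimate is only trustworthy if the instrument is relevant and affects the outcome solely through the endogenous variable, which is the validity question the paper asks LLMs to help assess.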

