Causal Effects with Unobserved Unit Types in Interacting Human-AI Systems
The paper provides a framework for experimentation on online platforms where humans and AI agents interact, a practical challenge in emerging human-AI systems. The contribution is incremental in that it builds on existing causal methods.
The paper addresses the problem of estimating causal effects on humans in interacting human-AI systems where unit types and interaction networks are unobserved. It shows that human-specific causal effects can be consistently recovered from subpopulations that vary in human composition and treatment exposure, and it validates the approach on a simulated platform with LLM agents.
We study experiments on interacting populations of humans and AI agents, where both unit types and the interaction network remain unobserved. Although causal effects propagate throughout the system, the goal is to estimate effects on humans. Examples include online platforms where human users interact alongside AI-driven accounts. We assume a human-AI prior that gives each unit a probability of being human. While humans cannot be distinguished at the unit level, the prior allows us to compute the average human composition within large subpopulations. We then model outcome dynamics through a causal message passing (CMP) framework and analyze sample-mean outcomes across subpopulations. We show that by constructing subpopulations that vary in expected human composition and treatment exposure, one can consistently recover human-specific causal effects. Our results characterize when distributional knowledge of population composition (without observing unit types or the interaction network) is sufficient for identification. We validate the approach on a simulated human-AI platform driven by behaviorally differentiated LLM agents. Together, these results provide a theoretical and practical framework for experimentation in emerging human-AI systems.
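To make the identification idea concrete, the following is a minimal sketch under strong simplifying assumptions, not the paper's CMP estimator. It assumes a static, linear outcome model with no interference: each subpopulation k has a known expected human share p[k] (from the prior) and a known treated fraction w[k], and its sample-mean outcome is approximately p[k] * (a_h + b_h * w[k]) + (1 - p[k]) * (a_a + b_a * w[k]). All function names, parameter names, and numeric values are hypothetical. Because p and w vary across subpopulations, an ordinary least-squares fit on the subpopulation means can separate the human treatment effect b_h from the AI treatment effect b_a without ever observing unit types.

```python
import numpy as np

def simulate_subpop_means(p, w, a_h, b_h, a_a, b_a, n_per_group, rng):
    """Simulate sample-mean outcomes for each subpopulation.

    p[k] is the expected human share and w[k] the treated fraction.
    Unit types are drawn here only to generate data; the estimator
    below never sees them. (Hypothetical toy model, no interference.)
    """
    means = np.empty(len(p))
    for k in range(len(p)):
        is_human = rng.random(n_per_group) < p[k]
        treated = rng.random(n_per_group) < w[k]
        base = np.where(is_human, a_h, a_a)
        effect = np.where(is_human, b_h, b_a)
        y = base + effect * treated + rng.normal(0.0, 1.0, n_per_group)
        means[k] = y.mean()
    return means

def estimate_human_effect(p, w, means):
    """Least-squares fit of subpopulation means on composition and exposure.

    Columns correspond to (a_h, b_h, a_a, b_a) in the assumed linear
    mixture model; the second coefficient is the human treatment effect.
    """
    X = np.column_stack([p, p * w, 1.0 - p, (1.0 - p) * w])
    coef, *_ = np.linalg.lstsq(X, means, rcond=None)
    return coef[1]

# Subpopulations on a 3 x 3 grid of human share and treatment exposure.
rng = np.random.default_rng(0)
p_grid, w_grid = np.meshgrid([0.2, 0.5, 0.8], [0.1, 0.5, 0.9])
p, w = p_grid.ravel(), w_grid.ravel()

means = simulate_subpop_means(p, w, a_h=1.0, b_h=2.0, a_a=0.0, b_a=-1.0,
                              n_per_group=20_000, rng=rng)
b_h_hat = estimate_human_effect(p, w, means)
print(f"estimated human treatment effect: {b_h_hat:.3f}")
```

The sketch recovers the human effect only because both the human share and the treated fraction vary across subpopulations; if either were constant, the design matrix would lose rank and the human and AI effects could not be separated. The paper's setting additionally involves outcome dynamics and an unobserved interaction network, which this toy model deliberately omits.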