CL AI LG | Mar 30, 2024

Causal Inference for Human-Language Model Collaboration

Bohan Zhang, Yixin Wang, Paramveer S. Dhillon

arXiv:2404.00207v1 | 15.230 citations | h-index: 16 | Has Code | NAACL

Originality: Incremental advance

AI Analysis

This work addresses the challenge of identifying effective interaction strategies for users collaborating with language models, though its advance in causal inference methods for text data is incremental.

The paper tackles the problem of estimating the causal effects of text-based interaction strategies in human-language model collaboration. It introduces a new causal estimand, the Incremental Stylistic Effect (ISE), to handle high-dimensional treatments, and develops the CausalCollab algorithm, which empirically reduces confounding and improves counterfactual estimation over baselines in three scenarios.

In this paper, we examine the collaborative dynamics between humans and language models (LMs), where the interactions typically involve LMs proposing text segments and humans editing or responding to these proposals. Productive engagement with LMs in such scenarios necessitates that humans discern effective text-based interaction strategies, such as editing and response styles, from historical human-LM interactions. This objective is inherently causal, driven by the counterfactual "what-if" question: how would the outcome of collaboration change if humans employed a different text editing/refinement strategy? A key challenge in answering this causal inference question is formulating an appropriate causal estimand: the conventional average treatment effect (ATE) estimand is inapplicable to text-based treatments due to their high dimensionality. To address this concern, we introduce a new causal estimand, the Incremental Stylistic Effect (ISE), which characterizes the average impact of infinitesimally shifting a text towards a specific style, such as increasing formality. We establish the conditions for the non-parametric identification of ISE. Building on this, we develop CausalCollab, an algorithm designed to estimate the ISE of various interaction strategies in dynamic human-LM collaborations. Our empirical investigations across three distinct human-LM collaboration scenarios reveal that CausalCollab effectively reduces confounding and significantly improves counterfactual estimation over a set of competitive baselines.
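
The idea of "infinitesimally shifting a text towards a specific style" can be illustrated with a toy finite-difference sketch. This is not the paper's CausalCollab algorithm and does not address confounding; it only shows what an average incremental effect along a style direction might look like. All names here (`incremental_effect`, `formality`, `weights`) and the vector representation of texts are hypothetical assumptions made for illustration.

```python
import random


def incremental_effect(outcome_fn, texts, style_dir, eps=1e-4):
    """Average finite-difference change in outcome_fn when each text vector
    is nudged by eps along style_dir (a hypothetical style direction)."""
    total = 0.0
    for t in texts:
        shifted = [ti + eps * si for ti, si in zip(t, style_dir)]
        total += (outcome_fn(shifted) - outcome_fn(t)) / eps
    return total / len(texts)


# Toy setup (assumption): texts are 3-dimensional embedding vectors.
random.seed(0)
texts = [[random.gauss(0, 1) for _ in range(3)] for _ in range(200)]
weights = [0.5, -1.0, 2.0]    # hypothetical outcome coefficients
formality = [1.0, 0.0, 0.0]   # hypothetical "more formal" direction


def outcome(t):
    # Linear toy outcome; a real outcome model would be learned from data.
    return sum(w * x for w, x in zip(weights, t))


estimate = incremental_effect(outcome, texts, formality)
print(round(estimate, 3))
```

With a linear toy outcome, the estimate reduces to the inner product of `weights` and `formality`. In observational human-LM data the outcome function is unknown and confounded, which is the gap the abstract says ISE identification and CausalCollab are meant to address.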
