ML · CL · ME · Feb 6, 2018

How to Make Causal Inferences Using Texts

arXiv:1802.02163v1 · 178 citations
Originality: Incremental advance
AI Analysis

This paper provides a rigorous foundation for social science researchers who use text data for causal analysis, though its contribution is incremental: it refines existing text-as-data techniques rather than introducing a new class of methods.

The paper tackles the challenge of making causal inferences using discovered textual measures by introducing a conceptual framework that addresses risks like identification problems and overfitting, applying it to experiments on immigration attitudes and bureaucratic response.

New text as data techniques offer a great promise: the ability to inductively discover measures that are useful for testing social science theories of interest from large collections of text. We introduce a conceptual framework for making causal inferences with discovered measures as a treatment or outcome. Our framework enables researchers to discover high-dimensional textual interventions and estimate the ways that observed treatments affect text-based outcomes. We argue that nearly all text-based causal inferences depend upon a latent representation of the text and we provide a framework to learn the latent representation. But estimating this latent representation, we show, creates new risks: we may introduce an identification problem or overfit. To address these risks we describe a split-sample framework and apply it to estimate causal effects from an experiment on immigration attitudes and a study on bureaucratic response. Our work provides a rigorous foundation for text-based causal inferences.
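The split-sample idea in the abstract can be sketched in a few lines: discover the text measure on one random half of the data, then estimate the effect only on the held-out half, so the measure cannot be tuned to the same documents used for estimation. The sketch below is illustrative only and is not the authors' implementation. The function names (`split_sample`, `discover_measure`, `estimate_effect`) are hypothetical, the "latent representation" is a deliberately crude stand-in (the share of a document's tokens that fall among the k most frequent training words), and the estimator is a plain difference in means for a binary treatment with a text-based outcome.

```python
import random
from collections import Counter
from statistics import mean


def tokenize(text):
    """Lowercase whitespace tokenizer (an assumption; any tokenizer would do)."""
    return text.lower().split()


def split_sample(docs, treatments, train_frac=0.5, seed=0):
    """Randomly partition (doc, treatment) pairs into a discovery set and an estimation set."""
    idx = list(range(len(docs)))
    random.Random(seed).shuffle(idx)
    cut = int(len(idx) * train_frac)
    train = [(docs[i], treatments[i]) for i in idx[:cut]]
    test = [(docs[i], treatments[i]) for i in idx[cut:]]
    return train, test


def discover_measure(train, k=5):
    """Discovery step: learn a mapping g from text to a number using ONLY the training split.

    Stand-in measure: fraction of a document's tokens that belong to the
    k most frequent words in the training documents.
    """
    counts = Counter(tok for doc, _ in train for tok in tokenize(doc))
    vocab = {word for word, _ in counts.most_common(k)}

    def g(text):
        toks = tokenize(text)
        return sum(t in vocab for t in toks) / len(toks) if toks else 0.0

    return g


def estimate_effect(test, g):
    """Estimation step: apply the fixed g to the held-out split and take a difference in means."""
    treated = [g(doc) for doc, t in test if t == 1]
    control = [g(doc) for doc, t in test if t == 0]
    if not treated or not control:
        raise ValueError("estimation split needs both treated and control units")
    return mean(treated) - mean(control)
```

A typical call sequence would be `train, test = split_sample(docs, treatments)`, then `g = discover_measure(train)`, then `estimate_effect(test, g)`. The key design choice is that `g` is frozen before the estimation split is ever examined; in practice the crude keyword measure would be replaced by whatever model is used to learn the latent representation.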

