Ambiguity Collapse by LLMs: A Taxonomy of Epistemic Risks
This paper addresses how LLMs mismanage ambiguous concepts, a problem that matters to anyone deploying LLMs in sensitive, value-laden domains such as content moderation or hiring, where it can distort conceptual understanding and foreclose deliberation.
This paper introduces "ambiguity collapse," a phenomenon in which large language models (LLMs) produce a single interpretation of a genuinely ambiguous term, bypassing the human practices through which meaning is ordinarily negotiated. It develops a taxonomy of epistemic risks at the process, output, and ecosystem levels and illustrates them with three case studies.
Large language models (LLMs) are increasingly used to make sense of ambiguous, open-textured, value-laden terms. Platforms routinely rely on LLMs for content moderation, asking them to label text based on disputed concepts like "hate speech" or "incitement"; hiring managers may use LLMs to rank who counts as "qualified"; and AI labs increasingly train models to self-regulate under constitutional-style ambiguous principles such as "biased" or "legitimate". This paper introduces ambiguity collapse: a phenomenon that occurs when an LLM encounters a term that genuinely admits multiple legitimate interpretations, yet produces a singular resolution, in ways that bypass the human practices through which meaning is ordinarily negotiated, contested, and justified. Drawing on interdisciplinary accounts of ambiguity as a productive epistemic resource, we develop a taxonomy of the epistemic risks posed by ambiguity collapse at three levels: process (foreclosing opportunities to deliberate, develop cognitive skills, and shape contested terms), output (distorting the concepts and reasons agents act upon), and ecosystem (reshaping shared vocabularies, interpretive norms, and how concepts evolve over time). We illustrate these risks through three case studies, and conclude by sketching multi-layer mitigation principles spanning training, institutional deployment design, interface affordances, and the management of underspecified prompts, with the goal of designing systems that surface, preserve, and responsibly govern ambiguity.