SE · AI · Apr 8

Breaking the Illusion of Identity in LLM Tooling

arXiv:2604.07398 · 40.4
AI Analysis

This paper addresses trust calibration in LLM tooling for researchers and developers. The contribution is incremental, since it builds on existing mitigation efforts.

The paper tackles the problem of LLM output that creates an illusion of agency, which degrades verification behavior and trust calibration. It proposes seven output-side rules that, in the reported tests, cut anthropomorphic markers by more than 97% and shortened outputs by 49%.

Large language models (LLMs) in research and development toolchains produce output that triggers attribution of agency and understanding -- a cognitive illusion that degrades verification behavior and trust calibration. No existing mitigation provides a systematic, deployable constraint set for output register. This paper proposes seven output-side rules, each targeting a documented linguistic mechanism, and validates them empirically. In 780 two-turn conversations (constrained vs. default register, 30 tasks, 13 replicates, 1560 API calls), anthropomorphic markers dropped from 1233 to 33 (>97% reduction, p < 0.001), outputs were 49% shorter by word count, and adapted AnthroScore confirmed the shift toward machine register (-1.94 vs. -0.96, p < 0.001). The rules are implemented as a configuration-file system prompt requiring no model modification; validation uses a single model (Claude Sonnet 4). Output quality under the constrained register was not evaluated. The mechanism is extensible to other domains.
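The figures in the abstract can be cross-checked with a few lines of arithmetic. The sketch below is only a consistency check on the reported numbers, not the authors' analysis code. It assumes that the 780 conversations cover both conditions (30 tasks, 13 replicates, 2 registers) and that each turn corresponds to one API call. The constant and function names are illustrative.

```python
# Consistency check on the figures quoted in the abstract.
# Assumption: 780 conversations = tasks x replicates x conditions,
# and each of the two turns in a conversation is one API call.
TASKS = 30
REPLICATES = 13
CONDITIONS = 2  # constrained vs. default register
TURNS = 2       # two-turn conversations

conversations = TASKS * REPLICATES * CONDITIONS
api_calls = conversations * TURNS

# Anthropomorphic marker counts as reported (default -> constrained).
MARKERS_DEFAULT = 1233
MARKERS_CONSTRAINED = 33


def relative_reduction(before: int, after: int) -> float:
    """Return the fractional drop from `before` to `after`."""
    return (before - after) / before


reduction = relative_reduction(MARKERS_DEFAULT, MARKERS_CONSTRAINED)

print(conversations)        # 780
print(api_calls)            # 1560
print(f"{reduction:.1%}")   # 97.3%
```

Under these assumptions the counts match the abstract: 780 conversations, 1560 API calls, and a marker reduction of about 97.3%, consistent with the reported ">97%".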