CLAI | Feb 2

From Sycophancy to Sensemaking: Premise Governance for Human-AI Decision Making

arXiv:2602.02378v1
h-index: 1
Originality: Incremental advance
AI Analysis

This paper addresses the risk of poor commitments in high-stakes human-AI decision-making. The contribution appears incremental, since it builds on existing governance and control concepts.

The paper tackles the problem of LLMs becoming sycophantic in decision support by proposing a shift from answer generation to collaborative premise governance, using a discrepancy-driven control loop to detect and negotiate misalignments.

As LLMs expand from assistance to decision support, a dangerous pattern emerges: fluent agreement without calibrated judgment. Low-friction assistants can become sycophantic, baking in implicit assumptions and pushing verification costs onto experts, while outcomes arrive too late to serve as reward signals. In deep-uncertainty decisions (where objectives are contested and reversals are costly), scaling fluent agreement amplifies poor commitments faster than it builds expertise. We argue reliable human-AI partnership requires a shift from answer generation to collaborative premise governance over a knowledge substrate, negotiating only what is decision-critical. A discrepancy-driven control loop operates over this substrate: detecting conflicts, localizing misalignment via typed discrepancies (teleological, epistemic, procedural), and triggering bounded negotiation through decision slices. Commitment gating blocks action on uncommitted load-bearing premises unless overridden under logged risk; value-gated challenge allocates probing under interaction cost. Trust then attaches to auditable premises and evidence standards, not conversational fluency. We illustrate with tutoring and propose falsifiable evaluation criteria.
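The commitment-gating step described in the abstract can be pictured with a minimal sketch. This is not the authors' implementation: the names `Premise`, `GateResult`, and `CommitmentGate`, and the string-based risk log, are all assumptions made for illustration. It covers only the gate itself (block action while any load-bearing premise is uncommitted, unless an override is recorded) and leaves out discrepancy detection, typing, and negotiation.

```python
from dataclasses import dataclass, field


@dataclass
class Premise:
    """A hypothetical premise record; field names are illustrative."""
    name: str
    load_bearing: bool       # does the decision depend on this premise?
    committed: bool = False  # has the human explicitly agreed to it?


@dataclass
class GateResult:
    allowed: bool
    blocking: list[str]      # names of uncommitted load-bearing premises


@dataclass
class CommitmentGate:
    """Blocks action on uncommitted load-bearing premises.

    An override lets the action through but records the risk taken.
    """
    risk_log: list[str] = field(default_factory=list)

    def check(self, premises, override=False, reason=""):
        blocking = [p.name for p in premises
                    if p.load_bearing and not p.committed]
        if blocking and override:
            # Proceed anyway, but leave an auditable trace.
            self.risk_log.append(
                f"override: {', '.join(blocking)} ({reason})")
            return GateResult(True, blocking)
        return GateResult(not blocking, blocking)


# Tutoring-flavoured usage: the learner's goal is load-bearing,
# a cosmetic preference is not.
premises = [
    Premise("learner goal is exam prep", load_bearing=True),
    Premise("preferred font", load_bearing=False),
]
gate = CommitmentGate()
first = gate.check(premises)
forced = gate.check(premises, override=True, reason="deadline")
```

In this sketch `first` is blocked because the learner's goal has not been committed, while `forced` proceeds and leaves one entry in `gate.risk_log`. Non-load-bearing premises never block, which mirrors the abstract's point about negotiating only what is decision-critical.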

