CL · May 28

ActTraitBench: Quantifying the Knowledge-Decision Gap in Large Language Models via Human-Grounded Behavioral Validation

Yutong Yang, Chenxi Miao, Weikang Li, Yunfang Wu

arXiv:2605.2979181.3

Predicted impact: top 77% in CL · last 90 days
Originality: Incremental advance

AI Analysis

For LLM evaluation and alignment, this work provides a human-grounded benchmark to measure personality consistency, revealing a systematic gap between self-report and behavior.

ActTraitBench quantifies the Knowledge-Decision Gap in LLMs, showing that larger and more capable models often exhibit stronger behavioral divergence despite highly consistent self-reports. The Chain of Cognitive Alignment (CoCA), an inference-time intervention, improves alignment in reasoning-capable frontier models but exposes capability limitations in smaller ones.

While Large Language Models (LLMs) can convincingly simulate personas in explicit self-reports, they often deviate in implicit behavioral decisions, revealing a substantial Knowledge-Decision Gap ($G_{\text{KD}}$). Existing benchmarks struggle to measure this asymmetry due to limited construct validity, multi-dimensional entanglement, and distributional biases in LLM-based evaluation. To address these issues, we propose ActTraitBench, a human-grounded evaluation framework for measuring personality consistency in LLMs. Grounded in empirical human data, ActTraitBench establishes one-to-one mappings between psychometric facets and behavioral paradigms, and applies a Distributional Calibration via Quantile Mapping procedure to align LLM-judge score distributions with human norms. Experiments on 14 mainstream LLMs reveal a pervasive knowledge-decision asymmetry, where larger and more capable models often exhibit stronger behavioral divergence despite highly consistent self-reports. To mitigate this gap, we further introduce the Chain of Cognitive Alignment (CoCA), a plug-and-play inference-time intervention that improves alignment in reasoning-capable frontier models while exposing clear capability limitations in smaller architectures.
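The abstract does not spell out how the Distributional Calibration via Quantile Mapping step is implemented. As a rough illustration of the general idea only, the sketch below shows generic empirical quantile mapping: each LLM-judge score is converted to its rank in the judge's own score distribution and then replaced by the human score at the same rank. The function name `quantile_map`, its parameters, and the toy data are assumptions made for this example and are not taken from the paper.

```python
import numpy as np


def quantile_map(judge_scores, human_scores, new_scores):
    """Map LLM-judge scores onto a human score distribution.

    Hypothetical helper: generic empirical quantile mapping, not the
    paper's exact procedure.
    """
    judge_sorted = np.sort(np.asarray(judge_scores, dtype=float))
    human = np.asarray(human_scores, dtype=float)
    new = np.asarray(new_scores, dtype=float)

    # Empirical CDF of the judge distribution evaluated at each new score:
    # the fraction of reference judge scores that are <= the new score.
    ranks = np.searchsorted(judge_sorted, new, side="right") / judge_sorted.size

    # Read off the human score at the same cumulative rank
    # (np.quantile interpolates linearly between order statistics).
    return np.quantile(human, ranks)


# Toy data (made up): a judge that scores on a compressed 1-5 scale,
# and human norms spread over 10-50.
judge_reference = [1, 2, 3, 4, 5]
human_reference = [10, 20, 30, 40, 50]
calibrated = quantile_map(judge_reference, human_reference, [0, 3, 5])
print(calibrated)
```

Because the mapping is monotone, it preserves the judge's ranking of responses while reshaping the score distribution toward the human reference, which is the property that makes quantile mapping a natural fit for this kind of calibration.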
