When Models Know More Than They Say: Probing Analogical Reasoning in LLMs
Reveals a task-dependent gap between internal knowledge and prompted behavior in LLMs, highlighting limitations in how prompting accesses available information.
LLMs struggle with analogical reasoning when analogies require latent information, not just surface cues. Probing internal representations outperforms prompting for rhetorical analogies in open-source models, but both perform poorly for narrative analogies.
Analogical reasoning is a core cognitive faculty essential for narrative understanding. While LLMs perform well when surface and structural cues align, they struggle when an analogy is not apparent on the surface and instead requires latent information, suggesting limitations in abstraction and generalization. In this paper, we compare a model's probed representations with its prompted performance at detecting narrative analogies, revealing an asymmetry: for rhetorical analogies, probing significantly outperforms prompting in open-source models, whereas for narrative analogies, both achieve similarly low performance. This suggests that the relationship between internal representations and prompted behavior is task-dependent and may reflect limitations in how prompting accesses available information.
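To make the probing side of this comparison concrete, the following is a minimal sketch of a linear probe: a logistic-regression classifier trained on fixed feature vectors to predict a binary label (for example, "analogous" versus "not analogous"). It is an illustration under stated assumptions, not the method used in the paper. The text above does not specify the probe architecture, the layer probed, or the datasets, so the synthetic vectors here merely stand in for hidden states extracted from a model, and all function names (`train_linear_probe`, `probe_accuracy`, `make_synthetic_states`) are hypothetical.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_linear_probe(X, y, lr=0.1, epochs=200):
    """Fit a logistic-regression probe with full-batch gradient descent."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for x, label in zip(X, y):
            # Prediction error for this example under the current probe.
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - label
            for j in range(d):
                grad_w[j] += err * x[j]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def probe_accuracy(X, y, w, b):
    """Fraction of examples whose label the probe predicts correctly."""
    correct = sum(
        int((sum(wi * xi for wi, xi in zip(w, x)) + b > 0) == bool(label))
        for x, label in zip(X, y)
    )
    return correct / len(X)


def make_synthetic_states(n, d, rng, signal=2.0):
    """ASSUMPTION: synthetic stand-ins for hidden states.

    The label is linearly encoded along dimension 0; the remaining
    dimensions are pure Gaussian noise.
    """
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        x = [rng.gauss(0.0, 1.0) for _ in range(d)]
        x[0] += signal if label else -signal
        X.append(x)
        y.append(label)
    return X, y


rng = random.Random(0)
X_train, y_train = make_synthetic_states(300, 8, rng)
X_test, y_test = make_synthetic_states(100, 8, rng)

w, b = train_linear_probe(X_train, y_train)
test_acc = probe_accuracy(X_test, y_test, w, b)
print(f"held-out probe accuracy: {test_acc:.2f}")
```

In a comparison of the kind described above, the held-out probe accuracy would be set against the accuracy of the same model's prompted answers on the same examples. A probe that scores well where prompting does not indicates that the relevant information is linearly recoverable from the representations even though prompting fails to surface it.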