AI Through the Human Lens: Investigating Cognitive Theories in Machine Psychology
This research addresses the problem of understanding AI behavior in support of AI safety and transparency, although its contribution is incremental: it applies existing psychological theories to LLMs.
The study investigated whether Large Language Models (LLMs) exhibit human-like cognitive patterns under four psychological frameworks, finding that the models often produce coherent narratives, are susceptible to positive framing, make moral judgments aligned with Liberty/Oppression concerns, and contradict themselves while rationalizing extensively.
We investigate whether Large Language Models (LLMs) exhibit human-like cognitive patterns under four established frameworks from psychology: the Thematic Apperception Test (TAT), Framing Bias, Moral Foundations Theory (MFT), and Cognitive Dissonance. We evaluate several proprietary and open-source models using structured prompts and automated scoring. Our findings reveal that these models often produce coherent narratives, show susceptibility to positive framing, exhibit moral judgments aligned with Liberty/Oppression concerns, and demonstrate self-contradictions tempered by extensive rationalization. Such behaviors mirror human cognitive tendencies yet are shaped by the models' training data and alignment methods. We discuss the implications for AI transparency, ethical deployment, and future work that bridges cognitive psychology and AI safety.