CL | Feb 19

Projective Psychological Assessment of Large Multimodal Models Using Thematic Apperception Tests

arXiv:2602.17108v1 | h-index: 19
Originality: Incremental advance
AI Analysis

This work provides a novel projective psychological framework for evaluating LMMs, addressing the need for non-language-based personality assessment in AI, though it is incremental in applying existing psychometric tools to new models.

This study assessed the personality traits of Large Multimodal Models (LMMs) using the Thematic Apperception Test (TAT) and found that while models understand interpersonal dynamics and self-concept well, they consistently fail to perceive and regulate aggression, with larger and more recent models outperforming smaller ones across assessment dimensions.

The Thematic Apperception Test (TAT) is a psychometrically grounded, multidimensional assessment framework that systematically differentiates between cognitive-representational and affective-relational components of personality-like functioning. It is a projective psychological test designed to uncover unconscious aspects of personality. This study examines whether the personality traits of Large Multimodal Models (LMMs) can be assessed through non-language-based modalities, using the Social Cognition and Object Relations Scale - Global (SCORS-G). LMMs are employed in two distinct roles: as subject models (SMs), which generate stories in response to TAT images, and as evaluator models (EMs), which assess these narratives using the SCORS-G framework. The evaluator models demonstrated an excellent ability to understand and analyze TAT responses, and their interpretations were highly consistent with those of human experts. The assessment results show that all models understand interpersonal dynamics very well and have a good grasp of the concept of self. However, they consistently fail to perceive and regulate aggression. Performance varied systematically across model families, with larger and more recent models consistently outperforming smaller and earlier ones across SCORS-G dimensions.
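
The two-role setup described above (subject models write stories, evaluator models rate them) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the function names, the stub models, and the choice to average ratings across evaluators are all assumptions made for the example. The eight dimension abbreviations and the 1-7 rating range come from the SCORS-G instrument itself and are not stated in this summary.

```python
from statistics import mean
from typing import Callable, Dict, List

# The eight SCORS-G dimensions, each rated on a 1-7 scale.
# AGG is "Experience and Management of Aggressive Impulses".
SCORS_G_DIMENSIONS = ["COM", "AFF", "EIR", "EIM", "SC", "AGG", "SE", "ICS"]

SubjectModel = Callable[[str], str]               # TAT image id -> story
EvaluatorModel = Callable[[str], Dict[str, int]]  # story -> rating per dimension


def assess_subject_model(
    images: List[str],
    subject_model: SubjectModel,
    evaluator_models: List[EvaluatorModel],
) -> Dict[str, float]:
    """Return the mean rating per SCORS-G dimension for one subject model.

    Averaging over images and evaluators is an assumption of this sketch.
    """
    if not images or not evaluator_models:
        raise ValueError("need at least one image and one evaluator model")

    scores: Dict[str, List[int]] = {dim: [] for dim in SCORS_G_DIMENSIONS}
    for image in images:
        story = subject_model(image)        # subject-model (SM) role
        for evaluate in evaluator_models:   # evaluator-model (EM) role
            ratings = evaluate(story)
            for dim in SCORS_G_DIMENSIONS:
                score = ratings[dim]
                if not 1 <= score <= 7:
                    raise ValueError(f"{dim} rating {score} is outside 1-7")
                scores[dim].append(score)
    return {dim: mean(values) for dim, values in scores.items()}


# Hypothetical stand-ins for real model calls, used only to run the sketch.
def stub_subject(image: str) -> str:
    return f"A story prompted by {image}."


def make_stub_evaluator(agg_score: int) -> EvaluatorModel:
    def evaluate(story: str) -> Dict[str, int]:
        ratings = {dim: 4 for dim in SCORS_G_DIMENSIONS}
        ratings["AGG"] = agg_score
        return ratings
    return evaluate
```

In practice the stubs would be replaced by calls to actual LMMs, and the per-dimension profile could then be compared across model families or against ratings from human experts.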

