AIOct 15, 2024

Evidence of Cognitive Deficits andDevelopmental Advances in Generative AI: A Clock Drawing Test Analysis

Isaac R. Galatzer-Levy, Jed McGiffin, David Munday, Xin Liu, Danny Karmon, Ilia Labzovsky, Rivka Moroshko, Amir Zait, Daniel McDuff

arXiv:2410.11756v14.21 citationsh-index: 5

Originality Synthesis-oriented

AI Analysis

This work addresses the problem of assessing AI cognitive abilities for researchers and developers, though it is incremental as it applies an existing neuropsychological test to new models.

The study evaluated generative AI models on the Clock Drawing Test, revealing that most struggled with accurate time representation, showing deficits akin to mild-severe cognitive impairment, with only GPT 4 Turbo and Gemini Pro 1.5 scoring perfectly (4/4).

Generative AI's rapid advancement sparks interest in its cognitive abilities, especially given its capacity for tasks like language understanding and code generation. This study explores how several recent GenAI models perform on the Clock Drawing Test (CDT), a neuropsychological assessment of visuospatial planning and organization. While models create clock-like drawings, they struggle with accurate time representation, showing deficits similar to mild-severe cognitive impairment (Wechsler, 2009). Errors include numerical sequencing issues, incorrect clock times, and irrelevant additions, despite accurate rendering of clock features. Only GPT 4 Turbo and Gemini Pro 1.5 produced the correct time, scoring like healthy individuals (4/4). A follow-up clock-reading test revealed only Sonnet 3.5 succeeded, suggesting drawing deficits stem from difficulty with numerical concepts. These findings may reflect weaknesses in visual-spatial understanding, working memory, or calculation, highlighting strengths in learned knowledge but weaknesses in reasoning. Comparing human and machine performance is crucial for understanding AI's cognitive capabilities and guiding development toward human-like cognitive functions.

View on arXiv PDF

Similar