Working Memory Capacity of ChatGPT: An Empirical Study
The study provides a benchmarking tool for AI working memory. The contribution is incremental, because it applies existing human cognitive tests to a new AI model.
The study assessed ChatGPT's working memory capacity using verbal and spatial n-back tasks and found a capacity limit similar to that of humans, with the same capacity patterns persisting across different instruction strategies.
Working memory is a critical aspect of both human intelligence and artificial intelligence, serving as a workspace for the temporary storage and manipulation of information. In this paper, we systematically assess the working memory capacity of ChatGPT, a large language model developed by OpenAI, by examining its performance in verbal and spatial n-back tasks under various conditions. Our experiments reveal that ChatGPT has a working memory capacity limit strikingly similar to that of humans. Furthermore, we investigate the impact of different instruction strategies on ChatGPT's performance and observe that the fundamental patterns of a capacity limit persist. From our empirical findings, we propose that n-back tasks may serve as tools for benchmarking the working memory capacity of large language models and hold potential for informing future efforts aimed at enhancing AI working memory.
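For readers unfamiliar with the paradigm, the following is a minimal sketch of how a verbal n-back trial sequence could be generated and scored. It is an illustration only, not the authors' experimental code: the function names, the consonant alphabet, the 30% match rate, and the hit-rate and false-alarm-rate scoring are all assumptions made here for the example.

```python
import random

def make_nback_sequence(n, length, match_rate=0.3,
                        alphabet="bcdfghjklmnpqrstvwxyz", seed=0):
    """Generate a letter sequence in which roughly `match_rate` of the
    eligible trials repeat the letter shown n steps earlier.
    All parameter defaults are illustrative assumptions."""
    rng = random.Random(seed)
    letters = []
    for i in range(length):
        if i >= n and rng.random() < match_rate:
            # Target trial: repeat the letter from n steps back.
            letters.append(letters[i - n])
        else:
            # Non-target trial: pick any letter that is not an n-back match.
            choices = [c for c in alphabet if i < n or c != letters[i - n]]
            letters.append(rng.choice(choices))
    return letters

def expected_responses(letters, n):
    """True where the current letter matches the one n steps earlier."""
    return [i >= n and letters[i] == letters[i - n]
            for i in range(len(letters))]

def score(responses, targets):
    """Return (hit rate, false-alarm rate) for a list of boolean responses."""
    hits = sum(r and t for r, t in zip(responses, targets))
    false_alarms = sum(r and not t for r, t in zip(responses, targets))
    n_targets = sum(targets)
    n_nontargets = len(targets) - n_targets
    hit_rate = hits / n_targets if n_targets else 0.0
    fa_rate = false_alarms / n_nontargets if n_nontargets else 0.0
    return hit_rate, fa_rate
```

In a setup like this, each letter would be presented to the model one turn at a time, the model's match or non-match answers would be collected as booleans, and `score` would be evaluated separately for each value of n to see how performance changes as n grows.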