GPT-ology, Computational Models, Silicon Sampling: How should we think about LLMs in Cognitive Science?
This work addresses methodological challenges for cognitive scientists using LLMs, but it is incremental as it reviews existing paradigms without introducing new methods.
The paper reviews emerging research paradigms for using Large Language Models in cognitive science, discussing their claims and challenges to scientific inference, and highlights outstanding issues such as closed-source models and reproducibility.
Large Language Models have taken the cognitive science world by storm. It is perhaps timely to take stock of the various research paradigms that have been used to make scientific inferences about "cognition" in these models or about human cognition. We review several emerging research paradigms -- GPT-ology, LLMs-as-computational-models, and "silicon sampling" -- along with recent papers that have used LLMs under these paradigms. In doing so, we discuss their claims as well as the challenges to scientific inference under each paradigm. We highlight several outstanding issues about LLMs that have to be addressed to push our science forward: closed-source vs. open-source models; (the lack of visibility of) training data; and reproducibility in LLM research, including forming conventions on new task "hyperparameters" like instructions and prompts.