Imitation versus Innovation: What children can do that large language and language-and-vision models cannot (yet)?
This work examines the limits of AI models in achieving human-like innovation, a question relevant to researchers in both AI and cognitive science. Its contribution is incremental, building on existing comparisons between AI and human cognition.
The paper compares large language and language-and-vision models with human children on two tasks: designing new tools and discovering novel causal structures. It finds that these models act as efficient imitation engines but struggle with innovation, suggesting that they may require more than large-scale data to match child-like capabilities.
Much discussion about large language models and language-and-vision models has focused on whether these models are intelligent agents. We present an alternative perspective. We argue that these artificial intelligence models are cultural technologies that enhance cultural transmission in the modern world, and are efficient imitation engines. We explore what AI models can tell us about imitation and innovation by evaluating their capacity to design new tools and discover novel causal structures, and contrast their responses with those of human children. Our work serves as a first step in determining which particular representations and competences, as well as which kinds of knowledge or skill, can be derived from particular learning techniques and data. Critically, our findings suggest that machines may need more than large-scale language and image data to achieve what a child can do.