Probing the Creativity of Large Language Models: Can models produce divergent semantic association?
This work addresses the problem of assessing creativity in AI, offering researchers and developers incremental insight into what current models can do.
The study investigates whether large language models can generate creative content, using the divergent association task (DAT), which asks for unrelated words and scores the semantic distance between them. With greedy search, GPT-4 outperformed 96% of humans; stochastic sampling improved scores for the other models but introduced a trade-off between creativity and stability.
Large language models possess a remarkable capacity for processing language, but it remains unclear whether they can also generate creative content. The present study investigates the creative thinking of large language models from a cognitive perspective. We use the divergent association task (DAT), an objective measure of creativity that asks models to generate unrelated words and calculates the semantic distance between them. We compare the results across different models and decoding strategies. Our findings indicate that: (1) With the greedy search strategy, GPT-4 outperforms 96% of humans, while GPT-3.5-turbo exceeds the average human level. (2) Stochastic sampling and temperature scaling are effective for obtaining higher DAT scores for all models except GPT-4, but they face a trade-off between creativity and stability. These results imply that advanced large language models are capable of divergent semantic association, a fundamental process underlying creativity.
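To make the scoring idea concrete, the following is a minimal sketch of a DAT-style score, not the implementation used in the study. It assumes the score is the mean pairwise cosine distance between word vectors, scaled by 100; the abstract itself only states that semantic distance between the generated words is calculated. The names `cosine_distance`, `dat_score`, and `toy_embeddings` are hypothetical, and the three-dimensional vectors are invented for illustration rather than taken from any real embedding model.

```python
import math
from itertools import combinations


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def dat_score(words, embeddings):
    """Mean pairwise cosine distance between the words' vectors, times 100.

    Assumption: averaging over all word pairs and scaling by 100 is one
    common way to turn semantic distance into a single score.
    """
    vectors = [embeddings[w] for w in words]
    pairs = list(combinations(vectors, 2))
    if not pairs:
        raise ValueError("need at least two words to score")
    return 100.0 * sum(cosine_distance(u, v) for u, v in pairs) / len(pairs)


# Hypothetical toy vectors; a real evaluation would load pretrained embeddings.
toy_embeddings = {
    "cat": [1.0, 0.0, 0.0],
    "dog": [0.9, 0.1, 0.0],
    "algebra": [0.0, 1.0, 0.0],
    "volcano": [0.0, 0.0, 1.0],
}

related = dat_score(["cat", "dog"], toy_embeddings)
unrelated = dat_score(["cat", "algebra", "volcano"], toy_embeddings)
print(f"related words:   {related:.2f}")
print(f"unrelated words: {unrelated:.2f}")
```

In this toy setup, the near-parallel vectors for "cat" and "dog" yield a score close to zero, while the mutually orthogonal vectors score much higher, which mirrors the intuition that more semantically distant word sets indicate more divergent association.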