DL | AI | Oct 25, 2024

Assessing the societal influence of academic research with ChatGPT: Impact case study evaluations

arXiv:2410.19948v1 | 18 citations | h-index: 91 | Journal of the Association for Information Science and Technology
Originality: Incremental advance
AI Analysis

This work addresses the challenge of automating societal impact evaluation for academics and departments, though it is incremental as it builds on existing AI tools for assessment support.

This study investigated whether ChatGPT could evaluate societal impact claims in academic research, specifically Impact Case Studies (ICS) from the UK Research Excellence Framework, by comparing its scores with expert assessments. It found positive correlations ranging from 0.18 to 0.71 across different fields.

Academics and departments are sometimes judged by how their research has benefitted society. For example, the UK Research Excellence Framework (REF) assesses Impact Case Studies (ICS), which are five-page evidence-based claims of societal impacts. This study investigates whether ChatGPT can evaluate societal impact claims and therefore potentially support expert human assessors. For this, various parts of 6,220 public ICS from REF2021 were fed to ChatGPT 4o-mini along with the REF2021 evaluation guidelines, comparing the results with published departmental average ICS scores. The results suggest that the optimal strategy for high correlations with expert scores is to input the title and summary of an ICS but not the remaining text, and to modify the original REF guidelines to encourage a stricter evaluation. The scores generated by this approach correlated positively with departmental average scores in all 34 Units of Assessment (UoAs), with values between 0.18 (Economics and Econometrics) and 0.56 (Psychology, Psychiatry and Neuroscience). At the departmental level, the corresponding correlations were higher, reaching 0.71 for Sport and Exercise Sciences, Leisure and Tourism. Thus, ChatGPT-based ICS evaluations are simple and viable to support or cross-check expert judgments, although their value varies substantially between fields.
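The comparison described above, between model-generated ICS scores and published departmental average scores, can be illustrated with a minimal sketch. It assumes a rank (Spearman) correlation, which the abstract does not specify, and it uses invented records; the function names (`average_ranks`, `spearman`, `department_level`) and all scores are hypothetical and are not taken from the study. The sketch computes the correlation once per ICS and once after averaging the model scores within each department.

```python
from collections import defaultdict
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def department_level(records):
    """Average model scores per department.

    records: iterable of (department, model_score, departmental_average_score).
    Returns (mean model score per department, expert average per department),
    both ordered by department name.
    """
    by_dept = defaultdict(list)
    expert = {}
    for dept, model_score, dept_avg in records:
        by_dept[dept].append(model_score)
        expert[dept] = dept_avg
    depts = sorted(by_dept)
    return [mean(by_dept[d]) for d in depts], [expert[d] for d in depts]


# Hypothetical data: two ICS per department, each paired with the
# published average score of its department.
records = [
    ("A", 3.0, 3.5), ("A", 3.5, 3.5),
    ("B", 2.5, 2.8), ("B", 3.0, 2.8),
    ("C", 3.5, 3.2), ("C", 2.0, 3.2),
]

ics_rho = spearman([r[1] for r in records], [r[2] for r in records])
dept_model, dept_expert = department_level(records)
dept_rho = spearman(dept_model, dept_expert)
print(f"ICS-level rho: {ics_rho:.2f}, department-level rho: {dept_rho:.2f}")
```

Averaging within departments smooths out noise in individual model scores, which is one plausible reason the departmental-level correlations reported above are higher than the ICS-level ones.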

