Shallow Synthesis of Knowledge in GPT-Generated Texts: A Case Study in Automatic Related Work Composition
This paper addresses the quality of AI-generated text in academic writing and offers researchers incremental insights into its limitations.
The study analyzed AI-assisted scholarly writing using ScholaCite, comparing human-written, GPT-generated, and human-AI collaborative texts, finding that GPT-4 can generate reasonable citation groupings for brainstorming but fails at detailed synthesis without human intervention.
Numerous AI-assisted scholarly applications have been developed to aid different stages of the research process. We present an analysis of AI-assisted scholarly writing generated with ScholaCite, a tool we built that is designed for organizing literature and composing Related Work sections for academic papers. Our evaluation method focuses on the analysis of citation graphs to assess the structural complexity and inter-connectedness of citations in texts and involves a three-way comparison between (1) original human-written texts, (2) purely GPT-generated texts, and (3) human-AI collaborative texts. We find that GPT-4 can generate reasonable coarse-grained citation groupings to support human users in brainstorming, but fails to perform detailed synthesis of related works without human intervention. We suggest that future writing assistant tools should not be used to draft text independently of the human author.
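As a rough illustration of the citation-graph idea, the sketch below builds a graph from a passage and measures how densely its citations are connected. The abstract does not specify how the graphs are constructed or which measures are used, so everything here is an assumption made for illustration: citations are taken to be numeric markers such as `[1]`, two citations are linked when they appear in the same sentence, and graph density stands in for inter-connectedness. The names `citation_graph` and `density` and both sample passages are hypothetical.

```python
import re
from itertools import combinations

# Assumption: citations appear as numeric markers like [1]; real texts may differ.
CITATION = re.compile(r"\[(\d+)\]")


def citation_graph(text):
    """Return (nodes, edges) of an undirected co-citation graph.

    Assumption: two citations are connected when they occur in the same sentence.
    """
    nodes, edges = set(), set()
    # Naive sentence split on terminal punctuation followed by whitespace.
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        cited = sorted(set(CITATION.findall(sentence)), key=int)
        nodes.update(cited)
        # Sorted ids give each pair a single canonical orientation.
        edges.update(combinations(cited, 2))
    return nodes, edges


def density(nodes, edges):
    """Fraction of possible citation pairs that are actually linked."""
    n = len(nodes)
    return 0.0 if n < 2 else 2 * len(edges) / (n * (n - 1))


# Hypothetical passages: one lists works in isolation, one relates them.
listing = (
    "Prior tools summarize papers [1]. Others retrieve citations [2]. "
    "A third line studies evaluation [3]."
)
synthesis = (
    "Summarization [1] and retrieval [2] both rely on embeddings [3]. "
    "Evaluation remains open [3] [1]."
)

listing_density = density(*citation_graph(listing))      # 0.0
synthesis_density = density(*citation_graph(synthesis))  # 1.0
```

Under these assumptions, a passage that mentions each work in its own sentence yields a graph with no edges, while a passage that discusses works together yields a connected graph. That is one simple way to make the contrast between coarse grouping and detailed synthesis measurable.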