Apr 28, 2025

Enhancing Systematic Reviews with Large Language Models: Using GPT-4 and Kimi

arXiv:2504.20276v1
Originality: Synthesis-oriented
AI Analysis

This work addresses the efficiency of systematic reviews for researchers, but it is incremental as it applies existing LLMs to a specific domain.

The study investigated GPT-4 and Kimi for systematic reviews by comparing their generated codes to human codes, finding that LLM performance varies with data volume and question complexity.

This study examined two Large Language Models (LLMs), GPT-4 and Kimi, for use in systematic reviews. We evaluated their performance by comparing LLM-generated codes with human-generated codes from a peer-reviewed systematic review on assessment. Our findings suggest that LLM performance varies with data volume and question complexity.
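The abstract does not say which agreement metric was used to compare LLM-generated and human-generated codes. As an illustration only, the sketch below computes two common choices, raw percent agreement and Cohen's kappa, for a single coding question. The function names and the example labels ("formative", "summative") are hypothetical and do not come from the paper.

```python
from collections import Counter


def percent_agreement(human, llm):
    """Share of items where the LLM code equals the human code."""
    if len(human) != len(llm) or not human:
        raise ValueError("code lists must be non-empty and of equal length")
    matches = sum(h == m for h, m in zip(human, llm))
    return matches / len(human)


def cohens_kappa(human, llm):
    """Chance-corrected agreement between two coders (Cohen's kappa)."""
    n = len(human)
    p_observed = percent_agreement(human, llm)
    human_counts = Counter(human)
    llm_counts = Counter(llm)
    # Expected agreement if both coders assigned labels independently
    # at their own observed label frequencies.
    p_expected = sum(human_counts[c] * llm_counts[c] for c in human_counts) / (n * n)
    if p_expected == 1.0:
        # Both coders used one identical label throughout; kappa is
        # undefined here, so treat it as perfect agreement.
        return 1.0
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical codes for four studies on one review question.
human_codes = ["formative", "summative", "formative", "summative"]
llm_codes = ["formative", "summative", "summative", "summative"]

agreement = percent_agreement(human_codes, llm_codes)
kappa = cohens_kappa(human_codes, llm_codes)
```

Computing such scores separately per question, or per batch size, would be one way to check how agreement changes with question complexity and data volume.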

