IR | AI | CL | Mar 5, 2024

ChatCite: LLM Agent with Human Workflow Guidance for Comparative Literature Summary

Tsinghua
arXiv:2403.02574v1 | 47 citations | h-index: 21 | COLING
Originality: Incremental advance
AI Analysis

This work addresses the time-consuming task of literature summarization for researchers, though it appears incremental as it builds on existing LLM-based methods by focusing on the summarization step.

The authors tackled the challenge of generating comparative literature summaries by introducing ChatCite, an LLM agent guided by human workflow, which outperformed other models in experiments using an automatic evaluation metric called G-Score.

The literature review is an indispensable step in the research process. It helps researchers comprehend the research problem and understand the current state of research while conducting a comparative analysis of prior works. However, summarizing the literature is challenging and time-consuming. Previous LLM-based studies on literature review mainly focused on the complete process, including literature retrieval, screening, and summarization. For the summarization step, however, a simple CoT method often lacks the ability to provide an extensive comparative summary. In this work, we first focus on the independent literature summarization step and introduce ChatCite, an LLM agent with human workflow guidance for comparative literature summary. This agent, by mimicking the human workflow, first extracts key elements from relevant literature and then generates summaries using a Reflective Incremental Mechanism. To better evaluate the quality of the generated summaries, we devised an LLM-based automatic evaluation metric, G-Score, with reference to human evaluation criteria. The ChatCite agent outperformed other models in various dimensions in the experiments. The literature summaries generated by ChatCite can also be directly used for drafting literature reviews.
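The two-stage workflow in the abstract (extract key elements, then build the summary incrementally with reflection) can be sketched roughly as follows. This is an illustrative reading, not the authors' implementation: the abstract does not specify prompts or how reflection works, so the draft/critique/revise loop, the `LLM` callable type, and both function names are assumptions made for this example.

```python
from typing import Callable, List

# Hypothetical interface: any function mapping a prompt string to a completion.
LLM = Callable[[str], str]


def extract_key_elements(llm: LLM, paper_text: str) -> str:
    """Stage 1: condense one paper into its key elements."""
    return llm("EXTRACT key elements (problem, method, findings):\n" + paper_text)


def reflective_incremental_summary(llm: LLM, target: str, references: List[str]) -> str:
    """Stage 2 (assumed form): fold in one reference at a time, then
    critique and revise the running summary before moving on."""
    target_elements = extract_key_elements(llm, target)
    summary = ""
    for ref in references:
        ref_elements = extract_key_elements(llm, ref)
        # Incremental step: extend the summary with a comparison to this reference.
        draft = llm(
            "DRAFT an updated comparative summary.\n"
            f"Target work: {target_elements}\n"
            f"Summary so far: {summary}\n"
            f"New reference: {ref_elements}"
        )
        # Reflective step: ask for a critique, then revise the draft accordingly.
        critique = llm("CRITIQUE this draft for missing comparisons:\n" + draft)
        summary = llm(f"REVISE the draft using the critique.\nDraft: {draft}\nCritique: {critique}")
    return summary


if __name__ == "__main__":
    # Stand-in model that labels each prompt by its first word; swap in a real client.
    def echo_llm(prompt: str) -> str:
        return f"<{prompt.split()[0].lower()} output>"

    print(reflective_incremental_summary(echo_llm, "target abstract", ["ref A", "ref B"]))
```

Passing the model in as a plain callable keeps the sketch independent of any particular LLM client and makes the control flow easy to check with a stub.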
