CL, AI | Oct 6, 2023

Auto-survey Challenge

Thanh Gia Hieu Khuong, Benedictus Kent Rachmat

arXiv:2310.04480v2 | 0.91 citations | h-index: 2

Originality: Synthesis-oriented

AI Analysis

This paper addresses the challenge of assessing AI's scholarly writing and reviewing capabilities, a question relevant to researchers and practitioners in AI and related fields. The contribution is incremental, since it builds on existing evaluation frameworks.

The authors introduced a platform to evaluate Large Language Models' ability to autonomously write and critique survey papers across multiple disciplines, using a simulated peer-review system with human oversight, and organized a competition at the AutoML 2023 conference to test models on these tasks.

We present a novel platform for evaluating the capability of Large Language Models (LLMs) to autonomously compose and critique survey papers spanning a vast array of disciplines including sciences, humanities, education, and law. Within this framework, AI systems undertake a simulated peer-review mechanism akin to that of traditional scholarly journals, with human organizers serving in an editorial oversight capacity. Using this platform, we organized a competition for the AutoML 2023 conference. Entrants are tasked with presenting stand-alone models adept at authoring articles from designated prompts and subsequently appraising them. Assessment criteria include clarity, reference appropriateness, accountability, and the substantive value of the content. This paper presents the design of the competition, including the implementation of baseline submissions and the methods of evaluation.
