CL, AI | Jun 18, 2024

CollabStory: Multi-LLM Collaborative Story Generation and Authorship Analysis

arXiv:2406.12665v3 | 41 citations | Has Code
Originality: Incremental advance
AI Analysis

This work addresses the emerging issue of LLM-LLM collaboration in writing tasks, which could affect plagiarism detection and academic integrity. It is incremental in that it extends existing human-human authorship analysis methods to LLMs.

The paper introduces CollabStory, the first dataset of exclusively LLM-generated collaborative stories, with over 32k stories, and finds that current baselines are inadequate for analyzing authorship in this multi-LLM setting.

The rise of unifying frameworks that enable seamless interoperability of Large Language Models (LLMs) has made LLM-LLM collaboration for open-ended tasks a possibility. Despite this, there have not been efforts to explore such collaborative writing. We take the next step beyond human-LLM collaboration to explore this multi-LLM scenario by generating the first exclusively LLM-generated collaborative stories dataset called CollabStory. We focus on single-author to multi-author (up to 5 LLMs) scenarios, where multiple LLMs co-author stories. We generate over 32k stories using open-source instruction-tuned LLMs. Further, we take inspiration from the PAN tasks that have set the standard for human-human multi-author writing tasks and analysis. We extend their authorship-related tasks for multi-LLM settings and present baselines for LLM-LLM collaboration. We find that current baselines are not able to handle this emerging scenario. Thus, CollabStory is a resource that could help propel an understanding as well as the development of new techniques to discern the use of multiple LLMs. This is crucial to study in the context of writing tasks since LLM-LLM collaboration could potentially overwhelm ongoing challenges related to plagiarism detection, credit assignment, maintaining academic integrity in educational settings, and addressing copyright infringement concerns. We make our dataset and code available at https://github.com/saranya-venkatraman/CollabStory.
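To make the multi-author setup concrete, here is a minimal sketch of sequential co-authoring followed by a PAN-style "did the author change here?" labeling. It is not the authors' pipeline: the `collaborate`, `author_change_labels`, and `make_stub` helpers, the model names, and the stub writers standing in for real LLM calls are all hypothetical, chosen only to illustrate the idea.

```python
from typing import Callable, List, Tuple

# Hypothetical type: a "writer" takes the story so far and returns the next part.
Writer = Callable[[str], str]


def collaborate(prompt: str, writers: List[Tuple[str, Writer]]) -> List[Tuple[str, str]]:
    """Let each writer continue the story in turn; return (author, segment) pairs."""
    segments = []
    story_so_far = prompt
    for name, write in writers:
        part = write(story_so_far)
        segments.append((name, part))
        story_so_far = story_so_far + " " + part
    return segments


def author_change_labels(segments: List[Tuple[str, str]]) -> List[int]:
    """Return 1 where two consecutive segments have different authors, else 0."""
    authors = [author for author, _ in segments]
    return [int(a != b) for a, b in zip(authors, authors[1:])]


def make_stub(tag: str) -> Writer:
    """Build a placeholder writer (assumption: a real setup would call an LLM here)."""
    def write(context: str) -> str:
        return f"[{tag} continues after {len(context.split())} words]"
    return write


# Hypothetical author sequence; the same model may write consecutive parts.
writers = [
    ("model_a", make_stub("A")),
    ("model_b", make_stub("B")),
    ("model_b", make_stub("B")),
    ("model_c", make_stub("C")),
]
segments = collaborate("Once upon a time", writers)
labels = author_change_labels(segments)
```

The labels mark boundaries between adjacent segments, so a story with n segments yields n - 1 labels. A detector for multi-LLM authorship would be evaluated on predicting such labels from the text alone.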

Code Implementations: 1 repo
