AI | Sep 17, 2025

CrowdAgent: Multi-Agent Managed Multi-Source Annotation System

arXiv:2509.14030v1 | 2 citations | h-index: 13 | Has Code | EMNLP
Originality: Incremental advance
AI Analysis

CrowdAgent addresses the need of NLP researchers and practitioners for holistic process control in data annotation. The contribution is incremental, since it builds on existing multi-source annotation methods.

The paper tackles the problem of managing diverse annotation sources like LLMs, SLMs, and human experts in NLP by introducing CrowdAgent, a multi-agent system for end-to-end process control, and demonstrates its effectiveness on six multimodal classification tasks.

High-quality annotated data is a cornerstone of modern Natural Language Processing (NLP). While recent methods begin to leverage diverse annotation sources, including Large Language Models (LLMs), Small Language Models (SLMs), and human experts, they often focus narrowly on the labeling step itself. A critical gap remains in the holistic process control required to manage these sources dynamically, addressing complex scheduling and quality-cost trade-offs in a unified manner. Inspired by real-world crowdsourcing companies, we introduce CrowdAgent, a multi-agent system that provides end-to-end process control by integrating task assignment, data annotation, and quality/cost management. It implements a novel methodology that rationally assigns tasks, enabling LLMs, SLMs, and human experts to advance synergistically in a collaborative annotation workflow. We demonstrate the effectiveness of CrowdAgent through extensive experiments on six diverse multimodal classification tasks. The source code and video demo are available at https://github.com/QMMMS/CrowdAgent.
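
The abstract does not spell out how tasks are assigned, so the following is only a minimal sketch of one way a quality-cost trade-off across annotation sources could look. It is not CrowdAgent's algorithm. The `Source` class, the `route` function, the per-label costs, the confidence threshold, and the stub annotators are all hypothetical names and values chosen for illustration: each item goes to the cheapest source first and is escalated to a costlier one only while confidence stays below the threshold and the per-item budget allows it.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Source:
    """A hypothetical annotation source (not part of CrowdAgent's code)."""
    name: str                                      # e.g. "SLM", "LLM", "human"
    cost: float                                    # assumed cost per label
    annotate: Callable[[str], Tuple[str, float]]   # returns (label, confidence)


def route(item: str, sources: List[Source], threshold: float,
          budget: float) -> Tuple[Optional[Tuple[str, float, str]], float]:
    """Query sources from cheapest to costliest.

    Stops at the first label whose confidence reaches `threshold`, or when
    the next source would exceed `budget`. Returns the most confident
    (label, confidence, source_name) seen and the total amount spent.
    """
    spent = 0.0
    best: Optional[Tuple[str, float, str]] = None
    for src in sorted(sources, key=lambda s: s.cost):
        if spent + src.cost > budget:
            break  # cannot afford this source or any costlier one
        label, conf = src.annotate(item)
        spent += src.cost
        if best is None or conf > best[1]:
            best = (label, conf, src.name)
        if conf >= threshold:
            break  # confident enough, no need to escalate
    return best, spent


# Stub annotators with fixed outputs, purely for demonstration.
SOURCES = [
    Source("human", 1.00, lambda text: ("neg", 1.0)),
    Source("SLM", 0.01, lambda text: ("pos", 0.6)),
    Source("LLM", 0.10, lambda text: ("neg", 0.8)),
]

if __name__ == "__main__":
    print(route("some review text", SOURCES, threshold=0.9, budget=2.0))
    print(route("some review text", SOURCES, threshold=0.9, budget=0.5))
```

With a generous budget the item is escalated all the way to the human stub; with a tight budget the sketch settles for the most confident label it could afford. A real system would also need scheduling across items and some way to estimate confidence, which this sketch leaves out.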

