LinguistAgent: A Reflective Multi-Model Platform for Automated Linguistic Annotation

arXiv:2602.05493v1 (Has Code)

Originality: Incremental advance

AI Analysis

This work addresses the time-consuming manual annotation faced by researchers in the Humanities and Social Sciences. The advance is incremental, since it builds on existing LLM capabilities.

The paper tackles the data-annotation bottleneck in the Humanities and Social Sciences by introducing LinguistAgent, a platform that automates linguistic annotation. It demonstrates the platform on metaphor identification, reporting real-time token-level evaluation metrics (Precision, Recall, and F1 score).

Data annotation remains a significant bottleneck in the Humanities and Social Sciences, particularly for complex semantic tasks such as metaphor identification. While Large Language Models (LLMs) show promise, a significant gap remains between the theoretical capability of LLMs and their practical utility for researchers. This paper introduces LinguistAgent, an integrated, user-friendly platform that leverages a reflective multi-model architecture to automate linguistic annotation. The system implements a dual-agent workflow, comprising an Annotator and a Reviewer, to simulate a professional peer-review process. LinguistAgent supports comparative experiments across three paradigms: Prompt Engineering (Zero/Few-shot), Retrieval-Augmented Generation, and Fine-tuning. We demonstrate LinguistAgent's efficacy using the task of metaphor identification as an example, providing real-time token-level evaluation (Precision, Recall, and $F_1$ score) against human gold standards. The application and code are released at https://github.com/Bingru-Li/LinguistAgent.
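
The token-level evaluation mentioned in the abstract can be illustrated with a minimal sketch. It assumes binary per-token labels (1 = metaphorical, 0 = literal) aligned between the model output and the human gold standard. The function name `token_prf`, the label encoding, and the example sentence are hypothetical and are not taken from the LinguistAgent codebase.

```python
from typing import Sequence, Tuple


def token_prf(gold: Sequence[int], pred: Sequence[int]) -> Tuple[float, float, float]:
    """Token-level precision, recall, and F1 for binary labels (1 = metaphor).

    Hypothetical helper for illustration; not the platform's own implementation.
    """
    if len(gold) != len(pred):
        raise ValueError("gold and pred must label the same tokens")

    # Count agreement and disagreement on the positive (metaphor) class.
    tp = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 1)
    fp = sum(1 for g, p in zip(gold, pred) if g == 0 and p == 1)
    fn = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 0)

    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Made-up labels for the tokens of:
# "He attacked every weak point in my argument"
gold = [0, 1, 0, 0, 1, 0, 0, 0]  # assumed human annotation
pred = [0, 1, 0, 1, 0, 0, 0, 0]  # assumed model annotation
precision, recall, f1 = token_prf(gold, pred)
print(f"P={precision:.2f} R={recall:.2f} F1={f1:.2f}")
```

Scoring at the token level, rather than per sentence, gives partial credit when the model finds some but not all metaphorical tokens in a sentence, which suits span-like tasks such as metaphor identification.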
