AI · CL · Dec 28, 2024

BaiJia: A Large-Scale Role-Playing Agent Corpus of Chinese Historical Characters

arXiv:2412.20024v2 · 5 citations · h-index: 2
Originality: Synthesis-oriented
AI Analysis

This work addresses the need for low-resource data for developing AI-driven historical role-playing agents. Its contribution is incremental: it compiles existing information into a structured corpus for LLMs.

The authors tackle the problem of fragmented historical textual records by introducing BaiJia, a large-scale corpus of Chinese historical characters. Their experiments show that the corpus improves the performance of foundational LLMs on historical role-playing tasks.

We introduce BaiJia, a comprehensive large-scale role-playing agent corpus that comprises various Chinese historical characters. This corpus is noteworthy for being the pioneering compilation of low-resource data that large language models (LLMs) can use to power AI-driven historical role-playing agents. BaiJia addresses the challenge of historical textual records that are fragmented across different forms and modalities, integrating information about each character, including biography, literary works, family relations, historical events, and so on. We conduct extensive experiments to demonstrate the effectiveness of the BaiJia agent corpus in bolstering the role-playing abilities of various foundational LLMs and in promoting the development and assessment of LLMs on historical role-playing tasks. The agent corpus is available at baijia.online.

Code Implementations: 1 repo
