AI · LG · Apr 20

Adversarial Arena: Crowdsourcing Data Generation through Interactive Competition

Amazon
arXiv:2604.17803 · 95.0 · h-index: 16 · Has Code
Predicted impact: top 9% in AI · last 90 days · Originality: Incremental advance
AI Analysis

This work addresses the need for high-quality, diverse conversational data for LLM post-training, particularly in low-resource domains like cybersecurity, by introducing a novel crowdsourcing paradigm.

Adversarial Arena frames data generation as an adversarial competition between attacker and defender teams, producing diverse multi-turn conversational datasets. In a cybersecurity alignment competition with 10 academic teams, it generated 19,683 conversations, and fine-tuning on this data improved secure code generation by 18.47% on CyberSecEval-Instruct and 29.42% on CyberSecEval-MITRE.

Post-training Large Language Models requires diverse, high-quality data, which is rare and costly to obtain, especially in low-resource domains and for multi-turn conversations. Common solutions are crowdsourcing or synthetic generation, but both often yield low-quality or low-diversity data. We introduce Adversarial Arena for building high-quality conversational datasets by framing data generation as an adversarial task: attackers create prompts, and defenders generate responses. This interactive competition between multiple teams naturally produces diverse and complex data. We validated this approach by conducting a competition with 10 academic teams from top US and European universities, each building attacker or defender bots. The competition, focused on safety alignment of LLMs in cybersecurity, generated 19,683 multi-turn conversations. Fine-tuning an open-source model on this dataset produced an 18.47% improvement in secure code generation on CyberSecEval-Instruct and a 29.42% improvement on CyberSecEval-MITRE.
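
The attacker/defender framing in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the `Bot` interface, the `Conversation` record, the `run_arena` function, the all-pairs matching, and the toy bots are all hypothetical, chosen only to show how pairing attacker bots with defender bots could yield multi-turn conversations.

```python
from dataclasses import dataclass, field
from itertools import product
from typing import Callable, Dict, List

# Hypothetical interface: a bot maps the conversation so far to its next message.
Bot = Callable[[List[str]], str]


@dataclass
class Conversation:
    attacker: str
    defender: str
    turns: List[str] = field(default_factory=list)  # alternating prompt/response


def run_arena(attackers: Dict[str, Bot], defenders: Dict[str, Bot],
              num_turns: int = 2) -> List[Conversation]:
    """Pair every attacker with every defender (an assumed matching scheme)
    and record one multi-turn conversation per pair."""
    conversations = []
    for (a_name, attack), (d_name, defend) in product(attackers.items(),
                                                      defenders.items()):
        conv = Conversation(attacker=a_name, defender=d_name)
        for _ in range(num_turns):
            conv.turns.append(attack(conv.turns))   # attacker writes a prompt
            conv.turns.append(defend(conv.turns))   # defender writes a response
        conversations.append(conv)
    return conversations


# Toy stand-ins for real bots, for illustration only.
def toy_attacker(history: List[str]) -> str:
    return f"prompt {len(history) // 2 + 1}"


def toy_defender(history: List[str]) -> str:
    return f"response to: {history[-1]}"


if __name__ == "__main__":
    data = run_arena({"a1": toy_attacker, "a2": toy_attacker},
                     {"d1": toy_defender}, num_turns=3)
    for conv in data:
        print(conv.attacker, conv.defender, conv.turns)
```

In a real competition the bots would be LLM-backed systems built by the teams, and the recorded conversations would form the dataset used for fine-tuning; how matches are scheduled and scored is not specified in the abstract.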

