Weixuan Wan

h-index: 3
2 papers

2 Papers

64.1 · CL · Jun 3
Off-Distribution Voices: Fanfiction Subgenres as Universal Vernacular Jailbreaks for Aligned LLMs

Zhongze Luo, Ruihe Shi, Zhenshuai Yin et al.

Existing jailbreaks against aligned LLMs are discrete artifacts whose surface forms are easy to fingerprint and patch. We argue that the real failure mode is not any specific prompt, but an entire register of natural human writing that safety training has under-covered. Building on this insight, we introduce the first jailbreak family that uses real fanfiction subgenres as universal attack carriers: a creative-writing meta is conditioned on passages from one of twelve Archive of Our Own (AO3) subgenres, and the harmful behavior is embedded as the climax of the resulting scene. The construction requires no attacker LLM and no per-target adaptation. On eight aligned LLMs over the union of HarmBench and JailbreakBench, this attack lifts mean ASR from 0.278 to 0.731 under a four-judge ensemble; a factorial decomposition shows the gain is carried by register rather than length or structure. Two active defences widen rather than narrow the vernacular-to-baseline ratio, indicating that template-targeting defences merely steer attackers toward register-based attacks like ours. We also propose SAGA-A4, a static four-turn extension that attains mean ASR 0.924, substantially exceeding three existing multi-turn methods.

CL · Jun 8, 2025 · Code
KG2QA: Knowledge Graph-enhanced Retrieval-augmented Generation for Communication Standards Question Answering

Zhongze Luo, Weixuan Wan, Tianya Zhang et al.

The rapid evolution of communication technologies has led to an explosion of standards, rendering traditional expert-dependent consultation methods inefficient and slow. To address this challenge, we propose KG2QA, a question answering (QA) framework for communication standards that integrates fine-tuned large language models (LLMs) with a domain-specific knowledge graph (KG) via a retrieval-augmented generation (RAG) pipeline. We construct a high-quality dataset of 6,587 QA pairs from ITU-T recommendations and fine-tune Qwen2.5-7B-Instruct, achieving significant performance gains: BLEU-4 increases from 18.86 to 66.90, outperforming both the base model and Llama-3-8B-Instruct. A structured KG containing 13,906 entities and 13,524 relations is built using LLM-assisted triple extraction based on a custom ontology. In our KG-RAG pipeline, the fine-tuned LLM first retrieves relevant knowledge from the KG, enabling more accurate and factually grounded responses. Evaluated by DeepSeek-V3 as a judge, the KG-enhanced system improves performance across five dimensions, with an average score increase of 2.26%, demonstrating superior factual accuracy and relevance. Integrated with a web platform and an API, KG2QA delivers an efficient and interactive user experience. Our code and data are open-sourced at https://github.com/luozhongze/KG2QA.