CL · AI · LG · Nov 11, 2024

LongSafety: Enhance Safety for Long-Context LLMs

arXiv:2411.06899v27 citations · h-index: 12
Originality: Incremental advance
AI Analysis

This paper addresses safety concerns for users of long-context LLMs in complex applications. It is an incremental advance because it builds on existing safety alignment research.

The paper tackles the underexplored safety issues in long-context large language models by introducing LongSafety, a comprehensive safety alignment dataset with 10 tasks and 17k samples averaging 40.9k tokens. Experiments show training with LongSafety enhances long-context safety performance while improving short-context safety and preserving general capabilities.

Recent advancements in model architectures and length extrapolation techniques have significantly extended the context length of large language models (LLMs), paving the way for their application in increasingly complex tasks. However, despite the growing capabilities of long-context LLMs, safety issues in long-context scenarios remain underexplored. While safety alignment in short contexts has been widely studied, the safety concerns of long-context LLMs have not been adequately addressed. In this work, we introduce LongSafety, a comprehensive safety alignment dataset for long-context LLMs, containing 10 tasks and 17k samples, with an average length of 40.9k tokens. Our experiments demonstrate that training with LongSafety can enhance long-context safety performance while also improving short-context safety and preserving general capabilities. Furthermore, we demonstrate that long-context safety is not equivalent to long-context alignment combined with short-context safety data, and that LongSafety generalizes across context lengths and long-context safety scenarios.

Code Implementations: 1 repo