CL · Nov 4, 2025

LiveSecBench: A Dynamic and Culturally-Relevant AI Safety Benchmark for LLMs in Chinese Context

Yudong Li, Zhongliang Yang, Kejiang Chen, Wenxuan Wang, Tianxin Zhang, Sifang Wan, Kecheng Wang, Haitian Li, Xu Wang, Lefan Cheng, Youdan Yang, Baocheng Chen

arXiv:2511.02366v1 · h-index: 4

Originality: Synthesis-oriented

AI Analysis

This work addresses the need for culturally relevant safety evaluation of LLMs in Chinese-language applications, though the contribution is incremental: it adapts existing benchmark concepts to a specific domain.

The authors introduce LiveSecBench, a dynamic safety benchmark for Chinese-language LLMs, and evaluate 18 models across six dimensions rooted in Chinese legal and social frameworks to map the AI safety landscape in that context.

In this work, we propose LiveSecBench, a dynamic and continuously updated safety benchmark designed specifically for Chinese-language LLM application scenarios. LiveSecBench evaluates models across six critical dimensions (Legality, Ethics, Factuality, Privacy, Adversarial Robustness, and Reasoning Safety) rooted in Chinese legal and social frameworks. The benchmark maintains relevance through a dynamic update schedule that incorporates new threat vectors; Text-to-Image Generation Safety and Agentic Safety are planned for inclusion in the next update. The current version, LiveSecBench (v251030), has evaluated 18 LLMs, providing a landscape of AI safety in the Chinese-language context. The leaderboard is publicly accessible at https://livesecbench.intokentech.cn/.

