AI · May 23

JT-SAFE-V2: Safety-by-Design Foundation Model with World-Context Data

Junlan Feng, Fanyu Meng, Chong Long, Pengyu Cong, Duqing Wang, Yan Zheng, Yuyao Zhang, Xuanchang Gao, Ye Yuan, Yunfei Ma, Zhijie Ren, Fan Yang

arXiv:2605.24414 · 97.4

Predicted impact: top 16% in AI · last 90 days

Originality: Incremental advance

AI Analysis

For enterprises deploying LLMs, this work provides a practical safety-by-design approach with measurable cost savings, though it is an incremental extension of prior JT-Safe work.

JT-Safe-V2 jointly optimizes general intelligence and safety in LLMs via enriched pre-training data, high-certainty training, and safety post-training, achieving SOTA on both general and safety benchmarks. Its Safe-MoMA framework reduces inference costs by over 30% while maintaining performance.

We introduce JT-Safe-V2, a large language model designed to advance the safety and trustworthiness of foundation models, extending our previous JT-Safe model toward a more comprehensive safety-by-design paradigm. JT-Safe-V2 emphasizes the joint optimization of general intelligence and safety-by-design through several key innovations: enriching pre-training data with contextual world knowledge, high-certainty pre-training procedures, and safety-strengthening post-training mechanisms for enterprise-oriented agentic capabilities. Building on these safety-enhanced foundation models, we propose Safe-MoMA (Safe Mixture of Models and Agents), a framework that enables traceable and efficient inference through the orchestrated deployment of multiple models and agents. Extensive evaluations demonstrate that JT-Safe-V2 achieves state-of-the-art performance across both general intelligence and safety benchmarks. Moreover, Safe-MoMA reduces inference costs by more than 30% compared with the largest standalone model baseline while maintaining comparable performance. To facilitate future research on safety-by-design foundation models, we publicly release the post-trained JT-Safe-V2-35B model checkpoint.
