LG · Sep 8, 2025

\texttt{R$^\textbf{2}$AI}: Towards Resistant and Resilient AI in an Evolving World

Youbang Sun, Xiang Wang, Jie Fu, Chaochao Lu, Bowen Zhou

arXiv:2509.06786v1 · 9.42 citations · h-index: 8

Originality: Highly original

AI Analysis

This paper addresses AI safety for developers and society, offering a proactive approach to mitigating vulnerabilities and existential risks as AI advances. It presents a conceptual framework rather than an incremental improvement.

The paper tackles the gap between AI capabilities and safety by proposing a new 'safe-by-coevolution' paradigm, which frames safety as a dynamic learning process to address both known and unforeseen risks in evolving environments.

In this position paper, we address the persistent gap between rapidly growing AI capabilities and lagging safety progress. Existing paradigms divide into ``Make AI Safe'', which applies post-hoc alignment and guardrails but remains brittle and reactive, and ``Make Safe AI'', which emphasizes intrinsic safety but struggles to address unforeseen risks in open-ended environments. We therefore propose \textit{safe-by-coevolution} as a new formulation of the ``Make Safe AI'' paradigm, inspired by biological immunity, in which safety becomes a dynamic, adversarial, and ongoing learning process. To operationalize this vision, we introduce \texttt{R$^2$AI} -- \textit{Resistant and Resilient AI} -- as a practical framework that unites resistance against known threats with resilience to unforeseen risks. \texttt{R$^2$AI} integrates \textit{fast and slow safe models}, adversarial simulation and verification through a \textit{safety wind tunnel}, and continual feedback loops that guide safety and capability to coevolve. We argue that this framework offers a scalable and proactive path to maintain continual safety in dynamic environments, addressing both near-term vulnerabilities and long-term existential risks as AI advances toward AGI and ASI.
