Security-First AI: Foundations for Robust and Trustworthy Systems
This paper addresses the problem of securing AI systems against adversarial manipulation, for an audience of researchers and practitioners. Its contribution is incremental: it builds on existing discussions without introducing new methods or data.
The paper argues that AI security should be prioritized as a foundational layer for trustworthy and resilient AI systems. It distinguishes security from safety and discusses threat models, attack vectors, and defense mechanisms.
The conversation around artificial intelligence (AI) often focuses on safety, transparency, accountability, alignment, and responsibility. However, AI security (i.e., the safeguarding of data, models, and pipelines from adversarial manipulation) underpins all of these efforts. This manuscript posits that AI security must be prioritized as a foundational layer. We present a hierarchical view of AI challenges, distinguishing security from safety, and argue for a security-first approach to enable trustworthy and resilient AI systems. We discuss core threat models, key attack vectors, and emerging defense mechanisms, concluding that a metric-driven approach to AI security is essential for robust AI safety, transparency, and accountability.