Apr 25
AI Integrity: Defending Against Backdoors and Secret Loyalties
Dave Banerjee, Onni Aarne
AI integrity means ensuring AI systems are free from secret or unauthorized modifications that could compromise their behavior. Integrity represents one pillar of the confidentiality, integrity, and availability (CIA) triad in information security: confidentiality preserves secrecy of sensitive information, integrity ensures data remain authentic and uncorrupted, and availability keeps systems operational when needed. While confidentiality receives some attention through efforts like RAND's Securing AI Model Weights report, and availability is naturally prioritized by market forces, AI integrity receives insufficient attention despite its importance to national security.