LG · CR · Feb 19, 2021

Fortify Machine Learning Production Systems: Detect and Classify Adversarial Attacks

arXiv:2102.09695v3 · 2 citations
Originality: Synthesis-oriented
AI Analysis

This work addresses security vulnerabilities in deployed ML systems, but it is incremental as it builds on existing adversarial detection methods.

The paper tackles the problem of protecting production machine learning systems from adversarial attacks by detecting and classifying such attacks, enabling targeted robustness training and real-time filtering to prevent damage.

Production machine learning systems are consistently under attack by adversarial actors. Deep learning models must be capable of accurately detecting fake or adversarial input while maintaining speed. In this work, we propose one piece of the production protection system: detecting an incoming adversarial attack and its characteristics. Detecting the type of an adversarial attack has two primary effects: the underlying model can be trained in a structured manner to be robust against those attacks, and the attacks can potentially be filtered out in real time before causing any downstream damage. The adversarial image classification space is explored for models commonly used in transfer learning.
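The real-time filtering idea in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the function `guarded_predict`, the label names, and both stub callables are hypothetical, and the out-of-range pixel rule merely stands in for a trained attack classifier.

```python
import numpy as np

# Hypothetical label; the abstract does not list the attack types it classifies.
CLEAN = "clean"


def guarded_predict(x, attack_classifier, model):
    """Run `model` on `x` only if `attack_classifier` labels the input as clean."""
    label = attack_classifier(x)
    if label != CLEAN:
        # Input is filtered out before it reaches the production model.
        # The label is returned so flagged inputs can be collected for
        # targeted robustness training later.
        return None, label
    return model(x), label


def stub_classifier(x):
    # Placeholder rule for illustration only (an assumption, not a real detector):
    # treat any pixel outside the valid [0, 1] range as a sign of perturbation.
    return CLEAN if (x.min() >= 0.0 and x.max() <= 1.0) else "perturbed"


def stub_model(x):
    # Placeholder for the protected production model.
    return int(x.mean() > 0.5)
```

In a real deployment the stub classifier would be replaced by a trained detector that outputs an attack type, and the returned label would decide whether the input is dropped, logged, or queued for retraining.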

