LG · CV · ML · May 5, 2017

Detecting Adversarial Samples Using Density Ratio Estimates

arXiv:1705.02224v4 · 2 citations
Originality: Incremental advance
AI Analysis

This paper addresses the critical issue of adversarial vulnerability in ML systems. The advance is incremental because it builds on existing detection approaches.

The paper tackles the problem of detecting adversarial samples in machine learning models by using direct density ratio estimation as a model-agnostic measure, achieving effective detection across various sample types and adversarial generation methods.

Machine learning models, especially those based on deep architectures, are used in everyday applications ranging from self-driving cars to medical diagnostics. Such models have been shown to be dangerously susceptible to adversarial samples: although indistinguishable from real samples to the human eye, adversarial samples lead to incorrect classifications with high confidence. The impact of adversarial samples is far-reaching, and their efficient detection remains an open problem. We propose to use direct density ratio estimation as an efficient, model-agnostic measure to detect adversarial samples. Our proposed method works equally well with single- and multi-channel samples, and with different adversarial sample generation methods. We also propose a method that uses density ratio estimates to generate adversarial samples under the added constraint of preserving the density ratio.
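To make the idea of direct density ratio estimation concrete, here is a minimal sketch in Python. The abstract does not say which estimator the paper uses, so this sketch assumes uLSIF (unconstrained least-squares importance fitting), one well-known direct estimator. The function names `gaussian_kernel` and `fit_ulsif`, the fixed `sigma` and `lam` values, the synthetic 2-D data, and the "mean deviation from 1" score are all illustrative assumptions, not the authors' method.

```python
import numpy as np

def gaussian_kernel(x, centers, sigma):
    """Gaussian kernel matrix of shape (len(x), len(centers))."""
    sq_dist = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq_dist / (2.0 * sigma ** 2))

def fit_ulsif(x_nu, x_de, sigma=1.0, lam=0.1, n_centers=100, seed=0):
    """Estimate r(x) = p_nu(x) / p_de(x) without estimating either density.

    x_nu, x_de: arrays of shape (n, d); images would be flattened first.
    sigma and lam are fixed here for brevity (assumption); in practice they
    are usually chosen by cross-validation.
    """
    rng = np.random.default_rng(seed)
    n_centers = min(n_centers, len(x_nu))
    # Kernel centers are a random subset of the numerator samples.
    centers = x_nu[rng.choice(len(x_nu), size=n_centers, replace=False)]
    phi_nu = gaussian_kernel(x_nu, centers, sigma)
    phi_de = gaussian_kernel(x_de, centers, sigma)
    H = phi_de.T @ phi_de / len(x_de)   # second moment under p_de
    h = phi_nu.mean(axis=0)             # first moment under p_nu
    # Regularized least-squares solution, then clip negative weights to zero.
    alpha = np.linalg.solve(H + lam * np.eye(n_centers), h)
    alpha = np.maximum(alpha, 0.0)
    return lambda x: gaussian_kernel(x, centers, sigma) @ alpha

# Synthetic stand-ins (assumption): a clean reference set, a clean test
# batch, and a batch drawn from a clearly different distribution.
rng = np.random.default_rng(0)
reference = rng.normal(size=(500, 2))
clean_batch = rng.normal(size=(500, 2))
shifted_batch = rng.normal(size=(500, 2)) + 5.0

for name, batch in [("clean", clean_batch), ("shifted", shifted_batch)]:
    ratio = fit_ulsif(reference, batch)
    # Hypothetical score: how far the estimated ratio strays from 1.
    score = np.mean(np.abs(ratio(batch) - 1.0))
    print(name, round(float(score), 3))
```

If the two sample sets come from the same distribution, the true ratio is 1 everywhere, so a batch whose estimated ratio strays far from 1 is a candidate for flagging. How to set the threshold, and whether the paper scores batches this way, is left open here.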

