LG CR ML | Mar 4, 2024

Robustness bounds on the successful adversarial examples in probabilistic models: Implications from Gaussian processes

arXiv:2403.01896v2 | h-index: 1 | JSAI-isAI

AI Analysis

This work provides theoretical robustness guarantees for probabilistic models against adversarial attacks, which is incremental but important for security in machine learning applications.

The paper addresses adversarial examples in machine learning by deriving an upper bound on the probability of a successful attack against Gaussian Process classification. The bound depends on the perturbation norm, the kernel function, and the distance between the closest differently labeled training points, and the authors confirmed the theoretical result experimentally on ImageNet.

An adversarial example (AE) is an attack on machine learning models, crafted by adding an imperceptible perturbation to the data to induce misclassification. In the current paper, we investigated an upper bound on the probability of successful AEs based on Gaussian Process (GP) classification, a probabilistic inference model. We proved a new upper bound on the probability of a successful AE attack that depends on the AE's perturbation norm, the kernel function used in the GP, and the distance between the closest pair of points with different labels in the training dataset. Surprisingly, the upper bound is determined regardless of the distribution of the sample dataset. We confirmed our theoretical result experimentally using ImageNet. In addition, we showed that changing the parameters of the kernel function changes the upper bound on the probability of successful AEs.
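The three quantities the abstract names can be made concrete with a small sketch. The abstract does not state the bound itself, so the code below does not compute it; it only evaluates the ingredients (perturbation norm, kernel value, closest cross-label distance) on toy data. The Gaussian (RBF) kernel, the `length_scale` parameter, the helper names, and the data are all assumptions chosen for illustration, not taken from the paper.

```python
import numpy as np


def rbf_kernel(x, y, length_scale=1.0):
    """Gaussian (RBF) kernel: exp(-||x - y||^2 / (2 * length_scale^2)).

    Assumed kernel for illustration; the abstract does not fix one.
    """
    diff = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    return float(np.exp(-np.dot(diff, diff) / (2.0 * length_scale ** 2)))


def closest_cross_label_distance(X, y):
    """Smallest Euclidean distance between two points with different labels."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    # Pairwise Euclidean distances, shape (n, n).
    dists = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    # True where the two points of a pair carry different labels.
    cross = y[:, None] != y[None, :]
    if not cross.any():
        raise ValueError("need at least two distinct labels")
    return float(dists[cross].min())


# Hypothetical toy training set with two classes.
X_train = np.array([[0.0, 0.0], [1.0, 0.0], [4.0, 0.0], [4.0, 3.0]])
y_train = np.array([0, 0, 1, 1])

# Hypothetical clean input and adversarial perturbation.
x = np.array([1.0, 0.0])
delta = np.array([0.3, 0.4])

perturbation_norm = float(np.linalg.norm(delta))
min_cross_dist = closest_cross_label_distance(X_train, y_train)

# Kernel similarity between the clean and perturbed input for two length scales.
k_narrow = rbf_kernel(x, x + delta, length_scale=0.5)
k_wide = rbf_kernel(x, x + delta, length_scale=2.0)

print(perturbation_norm, min_cross_dist, k_narrow, k_wide)
```

With a wider length scale the perturbed input stays closer to the clean one in kernel terms, which illustrates why changing a kernel parameter can change a kernel-dependent bound. How these quantities combine into the actual bound is given in the paper, not here.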

