Timon Gehr

2papers

2 Papers

LGJul 23, 2020
Provably Robust Adversarial Examples

Dimitar I. Dimitrov, Gagandeep Singh, Timon Gehr et al.

We introduce the concept of provably robust adversarial examples for deep neural networks - connected input regions constructed from standard adversarial examples which are guaranteed to be robust to a set of real-world perturbations (such as changes in pixel intensity and geometric transformations). We present a novel method called PARADE for generating these regions in a scalable manner which works by iteratively refining the region initially obtained via sampling until a refined region is certified to be adversarial with existing state-of-the-art verifiers. At each step, a novel optimization procedure is applied to maximize the region's volume under the constraint that the convex relaxation of the network behavior with respect to the region implies a chosen bound on the certification objective. Our experimental evaluation shows the effectiveness of PARADE: it successfully finds large provably robust regions including ones containing $\approx 10^{573}$ adversarial examples for pixel intensity and $\approx 10^{599}$ for geometric perturbations. The provability enables our robust examples to be significantly more effective against state-of-the-art defenses based on randomized smoothing than the individual attacks used to construct the regions.

LGApr 30, 2020
Robustness Certification of Generative Models

Matthew Mirman, Timon Gehr, Martin Vechev

Generative neural networks can be used to specify continuous transformations between images via latent-space interpolation. However, certifying that all images captured by the resulting path in the image manifold satisfy a given property can be very challenging. This is because this set is highly non-convex, thwarting existing scalable robustness analysis methods, which are often based on convex relaxations. We present ApproxLine, a scalable certification method that successfully verifies non-trivial specifications involving generative models and classifiers. ApproxLine can provide both sound deterministic and probabilistic guarantees, by capturing either infinite non-convex sets of neural network activation vectors or distributions over such sets. We show that ApproxLine is practically useful and can verify interesting interpolations in the networks latent space.