LG · AI · CR · Aug 3, 2023

URET: Universal Robustness Evaluation Toolkit (for Evasion)

arXiv:2308.01840v1 · 10 citations · h-index: 28
Originality: Highly original
AI Analysis

This work addresses the need for universal robustness evaluation in AI, extending adversarial testing beyond the image domain to a wider range of systems.

The authors tackle the problem of generating adversarial inputs for AI systems beyond image classification. Their framework applies pre-defined transformations to create semantically correct adversarial examples, and they demonstrate its generality across multiple tasks and input types.

Machine learning models are known to be vulnerable to adversarial evasion attacks, as illustrated by image classification models. Thoroughly understanding such attacks is critical to ensuring the safety and robustness of critical AI tasks. However, most evasion attacks are difficult to deploy against a majority of AI systems because they have focused on the image domain, which imposes only a few constraints. An image is composed of homogeneous, numerical, continuous, and independent features, unlike many other input types to AI systems used in practice. Furthermore, some input types include additional semantic and functional constraints that must be observed to generate realistic adversarial inputs. In this work, we propose a new framework to enable the generation of adversarial inputs irrespective of the input type and task domain. Given an input and a set of pre-defined input transformations, our framework discovers a sequence of transformations that results in a semantically correct and functional adversarial input. We demonstrate the generality of our approach on several diverse machine learning tasks with various input representations. We also show the importance of generating adversarial examples, as they enable the deployment of mitigation techniques.
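The idea of searching for a sequence of pre-defined transformations can be illustrated with a minimal sketch. This is not the URET API or the authors' algorithm: it is a plain greedy search written for illustration, and every name in it (`toy_score`, `is_valid`, `TRANSFORMS`, `greedy_search`) and the toy tabular record are assumptions. At each step it applies the valid transformation that most lowers the model score, and it stops once the score falls below a decision threshold.

```python
from typing import Callable, Dict, List, Optional, Sequence, Tuple

Record = Dict[str, int]
Transform = Tuple[str, Callable[[Record], Record]]


def toy_score(x: Record) -> float:
    """Hypothetical stand-in model: score in [0, 1] for the 'flagged' class."""
    s = 0.9 - 0.2 * x["num_accounts"] + 0.1 * x["num_late_payments"]
    return max(0.0, min(1.0, s))


def is_valid(x: Record) -> bool:
    """Hypothetical semantic constraint: counts stay within realistic bounds."""
    return 0 <= x["num_accounts"] <= 3 and x["num_late_payments"] >= 0


# Hypothetical pre-defined transformations; each returns a modified copy.
TRANSFORMS: List[Transform] = [
    ("open_account", lambda x: {**x, "num_accounts": x["num_accounts"] + 1}),
    ("add_late_payment",
     lambda x: {**x, "num_late_payments": x["num_late_payments"] + 1}),
]


def greedy_search(
    x: Record,
    score: Callable[[Record], float],
    transforms: Sequence[Transform],
    valid: Callable[[Record], bool],
    threshold: float = 0.5,
    max_steps: int = 5,
) -> Tuple[Optional[Record], List[str]]:
    """Greedily pick the valid transformation that lowers the score most.

    Returns (adversarial_input, transformation_names), or (None, names)
    if no valid sequence pushes the score below the threshold.
    """
    current, path = x, []
    for _ in range(max_steps):
        if score(current) < threshold:
            return current, path
        candidates = [(name, fn(current)) for name, fn in transforms]
        candidates = [(name, c) for name, c in candidates if valid(c)]
        if not candidates:
            break
        name, best = min(candidates, key=lambda nc: score(nc[1]))
        if score(best) >= score(current):
            break  # no valid transformation improves the objective
        current = best
        path.append(name)
    return (current, path) if score(current) < threshold else (None, path)


if __name__ == "__main__":
    start = {"num_accounts": 0, "num_late_payments": 1}
    adversarial, steps = greedy_search(start, toy_score, TRANSFORMS, is_valid)
    print(steps, adversarial)
```

Greedy selection is only one possible exploration strategy, chosen here for brevity. The validity check is applied to every candidate before it is scored, which is how the sketch keeps the result within the stated constraints.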

Code Implementations: 1 repo