Collective Robustness Certificates: Exploiting Interdependence in Graph Neural Networks
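This work addresses robustness certification for tasks such as node classification, image segmentation, and named-entity recognition, where a single input yields many predictions at once, and it produces substantially stronger certificates than methods that certify each prediction independently.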
The paper tackles the problem of overly pessimistic adversarial robustness certificates for tasks with multiple simultaneous predictions. It proposes the first collective robustness certificate, which computes how many predictions are simultaneously guaranteed to remain stable under perturbation. On the Citeseer dataset, this increases the average number of certifiable feature perturbations from 7 to 351.
In tasks like node classification, image segmentation, and named-entity recognition, we have a classifier that simultaneously outputs multiple predictions (a vector of labels) based on a single input, i.e., a single graph, image, or document, respectively. Existing adversarial robustness certificates consider each prediction independently and are thus overly pessimistic for such tasks. They implicitly assume that an adversary can use different perturbed inputs to attack different predictions, ignoring the fact that we have a single shared input. We propose the first collective robustness certificate, which computes the number of predictions that are simultaneously guaranteed to remain stable under perturbation, i.e., cannot be attacked. We focus on Graph Neural Networks and leverage their locality property (perturbations only affect the predictions in a close neighborhood) to fuse multiple single-node certificates into a drastically stronger collective certificate. For example, on the Citeseer dataset our collective certificate for node classification increases the average number of certifiable feature perturbations from $7$ to $351$.
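The gap between independent and collective certification can be illustrated with a toy brute-force sketch. This is not the paper's algorithm. It assumes a simplified base certificate in which prediction n stays stable as long as at most `radii[n]` feature perturbations fall inside its receptive field, and it enumerates every attack instead of solving an optimization problem, so it only works on tiny graphs. The names `naive_certificate`, `collective_certificate`, `receptive_fields`, and `radii` are hypothetical and chosen for illustration.

```python
from itertools import combinations_with_replacement


def naive_certificate(radii, budget):
    """Count predictions certified independently.

    Each prediction is treated as if the adversary could spend the whole
    budget inside its receptive field, so it counts only if radii[n] >= budget.
    """
    return sum(1 for r in radii if r >= budget)


def collective_certificate(receptive_fields, radii, budget):
    """Worst-case number of predictions that stay stable under ONE shared attack.

    receptive_fields[n]: set of nodes whose features can influence prediction n
                         (assumed locality structure of the toy model).
    radii[n]:            perturbations inside that receptive field which
                         prediction n is assumed to withstand.
    """
    num_nodes = len(radii)
    worst = num_nodes
    # Enumerate every placement of exactly `budget` perturbations on nodes.
    # A node may be hit several times (several of its features are changed).
    # Spending the full budget is enough: extra perturbations never help a prediction.
    for placement in combinations_with_replacement(range(num_nodes), budget):
        stable = 0
        for n in range(num_nodes):
            hits = sum(1 for v in placement if v in receptive_fields[n])
            if hits <= radii[n]:
                stable += 1
        worst = min(worst, stable)
    return worst


# Toy graph: two disconnected paths 0-1-2 and 3-4-5, with a one-hop
# receptive field (each node sees itself and its direct neighbors).
receptive_fields = [{0, 1}, {0, 1, 2}, {1, 2}, {3, 4}, {3, 4, 5}, {4, 5}]
radii = [1] * 6   # every prediction tolerates one perturbation nearby
budget = 2        # the adversary may perturb two features in total

naive = naive_certificate(radii, budget)                               # 0
collective = collective_certificate(receptive_fields, radii, budget)   # 3
print(naive, collective)
```

In this toy setting the independent view certifies nothing, because every prediction can be broken by some attack of size two. The collective view certifies three predictions: breaking any node requires both perturbations to land in the same path, which leaves the other path untouched.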