PatchDEMUX: A Certifiably Robust Framework for Multi-label Classifiers Against Adversarial Patches
This work addresses the vulnerability of multi-label classifiers to adversarial patches and provides a certifiable defense for applications such as object detection; the contribution is incremental in that it builds on existing single-label methods.
The paper tackles the problem of adversarial patch attacks on multi-label classifiers by introducing PatchDEMUX, a certifiably robust framework that extends single-label defenses to multi-label tasks, achieving non-trivial robustness on MS-COCO and PASCAL VOC datasets while maintaining high clean performance.
Deep learning techniques have enabled vast improvements in computer vision technologies. Nevertheless, these models are vulnerable to adversarial patch attacks, which can catastrophically impair performance. The physically realizable nature of these attacks calls for certifiable defenses, which feature provable guarantees on robustness. While certifiable defenses have been successfully applied to single-label classification, limited work has been done for multi-label classification. In this work, we present PatchDEMUX, a certifiably robust framework for multi-label classifiers against adversarial patches. Our approach is a generalizable method that can extend any existing certifiable defense for single-label classification; it does so by treating the multi-label classification task as a series of isolated binary classification problems, each of which can be provably certified. Furthermore, in the scenario where an attacker is limited to a single patch, we propose an additional certification procedure that can provide tighter robustness bounds. Using the current state-of-the-art (SOTA) single-label certifiable defense PatchCleanser as a backbone, we find that PatchDEMUX can achieve non-trivial robustness on the MS-COCO and PASCAL VOC datasets while maintaining high clean performance.
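The reduction to isolated binary problems can be illustrated with a short sketch. It is not the authors' implementation: the `BinaryResult` type, the function names, and the bookkeeping are hypothetical. It assumes only that some single-label certifiable defense has already produced, for each class, a prediction and a flag saying whether that prediction is provably unchangeable by any allowed patch. Every class whose prediction is not certified is counted as an error in the worst case. The single-patch procedure with tighter bounds mentioned in the abstract is not modeled here.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class BinaryResult:
    """Hypothetical per-class output of a single-label certifiable defense."""
    prediction: bool  # defended prediction: is the class present?
    certified: bool   # True if no allowed patch can change this prediction


def worst_case_counts(results: List[BinaryResult],
                      labels: List[bool]) -> Tuple[int, int, int]:
    """Return (TP lower bound, FP upper bound, FN upper bound) for one image.

    Each class is handled as an isolated binary problem. A class counts as
    safe only if its prediction is both certified and correct; otherwise the
    attacker is assumed to be able to make it wrong.
    """
    tp_lower = fp_upper = fn_upper = 0
    for res, present in zip(results, labels):
        safe = res.certified and res.prediction == present
        if present:
            if safe:
                tp_lower += 1   # guaranteed true positive
            else:
                fn_upper += 1   # may be missed under attack
        elif not safe:
            fp_upper += 1       # may be hallucinated under attack
    return tp_lower, fp_upper, fn_upper


def lower_bounds(tp: int, fp: int, fn: int) -> Tuple[float, float]:
    """Worst-case precision and recall from the counts above.

    Returning 0.0 for an empty denominator is a convention chosen for this
    sketch, not something stated in the source.
    """
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Usage: four classes, two of them present in the image.
results = [BinaryResult(True, True), BinaryResult(True, False),
           BinaryResult(False, True), BinaryResult(False, False)]
labels = [True, True, False, False]
print(lower_bounds(*worst_case_counts(results, labels)))
```

Because precision increases with true positives and decreases with false positives, and the number of ground-truth positives is fixed, plugging the worst-case counts into the usual formulas gives valid lower bounds on both metrics.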