Training Provably Robust Models by Polyhedral Envelope Regularization
This work addresses the need for more efficient and effective training of provably robust neural networks in security-critical applications; it is an incremental improvement over existing methods.
The authors tackle the problem of improving the certified robustness of neural networks against adversarial attacks. They bound the adversary-free region around each input by a polyhedral envelope, which gives finer-grained robustness guarantees, and introduce polyhedral envelope regularization (PER) to encourage larger envelopes. The results show that PER adds very little computational overhead and yields better robustness guarantees than state-of-the-art methods, without over-regularizing the model.
Training certifiable neural networks enables one to obtain models with robustness guarantees against adversarial attacks. In this work, we introduce a framework to bound the adversary-free region in the neighborhood of the input data by a polyhedral envelope, which yields finer-grained certified robustness. We further introduce polyhedral envelope regularization (PER) to encourage larger polyhedral envelopes and thus improve the provable robustness of the models. We demonstrate the flexibility and effectiveness of our framework on standard benchmarks; it applies to networks of different architectures and general activation functions. Compared with the state-of-the-art methods, PER has very little computational overhead and better robustness guarantees without over-regularizing the model.
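To make the geometric idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes that some bounding procedure (not shown here) has already produced, for each wrong class i, a linear lower bound a_i . delta + b_i on the margin between the true class and class i, valid for every perturbation delta with l_p norm at most eps. Under that assumption, the half-spaces a_i . delta + b_i > 0 form a polyhedral envelope of the adversary-free region, and by Hölder's inequality the distance from the input to the i-th face is b_i / ||a_i||_q, where q is the dual exponent of p. The function names, the inputs A and b, and the unscaled hinge form of the penalty are all illustrative assumptions; the exact regularizer used in the paper may differ.

```python
import math

def dual_norm(a, p):
    """l_q norm of a, where q is the dual exponent of p (1/p + 1/q = 1)."""
    if p == math.inf:
        return sum(abs(v) for v in a)            # dual of l_inf is l_1
    if p == 1:
        return max(abs(v) for v in a)            # dual of l_1 is l_inf
    q = p / (p - 1.0)
    return sum(abs(v) ** q for v in a) ** (1.0 / q)

def face_distance(a_i, b_i, p):
    """l_p distance from the input (delta = 0) to the face a_i . delta + b_i = 0."""
    norm = dual_norm(a_i, p)
    if norm == 0.0:
        # Degenerate constant bound: never violated if positive, no certificate otherwise.
        return math.inf if b_i > 0 else 0.0
    return b_i / norm

def certified_radius(A, b, eps, p=math.inf):
    """Radius r <= eps such that every margin bound stays positive for ||delta||_p < r."""
    d_min = min(face_distance(a_i, b_i, p) for a_i, b_i in zip(A, b))
    # The linear bounds are only assumed valid inside the eps-ball, so cap at eps.
    return max(0.0, min(eps, d_min))

def envelope_penalty(A, b, eps, p=math.inf):
    """Assumed hinge-style penalty: grows when a face lies closer than eps to the input."""
    return sum(max(0.0, eps - face_distance(a_i, b_i, p)) for a_i, b_i in zip(A, b))

if __name__ == "__main__":
    # Hypothetical bounds for a 2-D input and two wrong classes.
    A = [[1.0, -1.0], [0.5, 0.5]]
    b = [0.2, 0.3]
    print(certified_radius(A, b, eps=0.15))      # limited by the first face
    print(envelope_penalty(A, b, eps=0.15))      # only the first face is penalized
```

In this toy example the first face lies at l_inf distance of about 0.1 from the input, which is below eps = 0.15. The input is therefore not certified at the full radius eps, but the envelope still certifies the smaller radius, which is the sense in which the guarantee is finer-grained. A penalty of this kind is zero once every face is at least eps away, so minimizing it pushes the faces outward and enlarges the envelope.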