LG · Apr 5, 2023

Hyper-parameter Tuning for Adversarially Robust Models

Pedro Mendes, Paolo Romano, David Garlan

arXiv:2304.02497v33.84 citations · h-index: 74 · Has Code

Originality: Incremental advance

AI Analysis

The paper addresses the challenge of efficiently tuning hyper-parameters for adversarially robust machine learning models, an incremental but important step toward improving adversarial robustness.

This work tackles hyper-parameter tuning for adversarially robust models. It shows that tuning hyper-parameters separately for the standard and adversarial training phases can reduce error by up to 80% on clean inputs and up to 43% on adversarial inputs. It also proposes using cheap adversarial training methods as inexpensive proxies for state-of-the-art ones, which, combined with a multi-fidelity optimizer (taKG), improves tuning efficiency by up to 2.1x.

This work focuses on the problem of hyper-parameter tuning (HPT) for robust (i.e., adversarially trained) models, shedding light on the new challenges and opportunities arising during the HPT process for robust models. To this end, we conduct an extensive experimental study based on 3 popular deep models, in which we explore exhaustively 9 (discretized) HPs, 2 fidelity dimensions, and 2 attack bounds, for a total of 19208 configurations (corresponding to 50 thousand GPU hours). Through this study, we show that the complexity of the HPT problem is further exacerbated in adversarial settings due to the need to independently tune the HPs used during standard and adversarial training: succeeding in doing so (i.e., adopting different HP settings in both phases) can lead to a reduction of up to 80% and 43% of the error for clean and adversarial inputs, respectively. On the other hand, we also identify new opportunities to reduce the cost of HPT for robust models. Specifically, we propose to leverage cheap adversarial training methods to obtain inexpensive, yet highly correlated, estimations of the quality achievable using state-of-the-art methods. We show that, by exploiting this novel idea in conjunction with a recent multi-fidelity optimizer (taKG), the efficiency of the HPT process can be enhanced by up to 2.1x.
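The claim that cheap adversarial training yields "highly correlated" estimates can be checked with a rank correlation between the two sets of scores. The following is a minimal sketch of such a check, not the authors' code: the helper names (`ranks`, `spearman`) and the error values are made up for illustration, and Spearman's coefficient is an assumed choice of metric, since the abstract does not name one.

```python
from typing import List, Sequence


def ranks(values: Sequence[float]) -> List[float]:
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result


def spearman(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Spearman rank correlation: the Pearson correlation of the ranks."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally sized samples with at least 2 points")
    rx, ry = ranks(xs), ranks(ys)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / (var_x * var_y) ** 0.5


# Hypothetical adversarial error of five HP configurations, measured once
# with a cheap training method and once with an expensive one.
# These numbers are illustrative only and do not come from the paper.
cheap_error = [0.62, 0.48, 0.55, 0.71, 0.50]
expensive_error = [0.58, 0.45, 0.47, 0.69, 0.52]

rho = spearman(cheap_error, expensive_error)
print(f"Spearman correlation between cheap and expensive scores: {rho:.2f}")
```

A coefficient close to 1 would mean the cheap method ranks configurations almost the same way as the expensive one, which is what makes it usable as a low-cost fidelity during tuning.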
