LG AI ML

Oct 3, 2020

Interpreting Robust Optimization via Adversarial Influence Functions

Zhun Deng, Cynthia Dwork, Jialiang Wang, Linjun Zhang

arXiv:2010.01247v19.613 citations

Originality: Incremental advance

AI Analysis

This work provides a tool for interpreting robust optimization in adversarial training, an incremental advance of interest to researchers and practitioners in machine learning.

The paper tackles the problem of quantifying how robust optimization changes optimizers and prediction losses compared to standard training by introducing the Adversarial Influence Function (AIF) as an interpretability tool. It applies AIF to analyze model sensitivity, derives it for kernel regressions including neural tangent kernels, and demonstrates effectiveness experimentally.

Robust optimization is widely used in modern data science, especially in adversarial training. However, little research has been done to quantify how robust optimization changes the optimizers and the prediction losses compared to standard training. In this paper, inspired by the influence function in robust statistics, we introduce the Adversarial Influence Function (AIF) as a tool to investigate the solution produced by robust optimization. The proposed AIF has a closed form and can be calculated efficiently. To illustrate the usage of AIF, we apply it to study model sensitivity, a quantity defined to capture the change of prediction losses on the natural data after implementing robust optimization. We use AIF to analyze how model complexity and randomized smoothing affect the model sensitivity with respect to specific models. We further derive AIF for kernel regressions, with a particular application to neural tangent kernels, and experimentally demonstrate the effectiveness of the proposed AIF. Lastly, the theory of AIF is extended to distributional robust optimization.
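To make the notion of model sensitivity concrete, the following is a minimal numerical sketch. It does not use the paper's closed-form AIF. It assumes a linear regression model, an L2-bounded perturbation of each input, and a working definition of sensitivity as the natural loss of the robust minimizer minus the natural loss of the standard minimizer. All function names, the synthetic data, and the radius `eps` are hypothetical choices made for illustration.

```python
import numpy as np

def natural_loss(w, X, y):
    """Mean squared error on the unperturbed (natural) data."""
    return np.mean((y - X @ w) ** 2)

def robust_loss(w, X, y, eps):
    """Worst-case squared error when each x_i may move within an L2 ball of radius eps.
    For a linear model the inner maximum equals (|r_i| + eps * ||w||_2)^2."""
    r = y - X @ w
    return np.mean((np.abs(r) + eps * np.linalg.norm(w)) ** 2)

def robust_grad(w, X, y, eps):
    """A (sub)gradient of robust_loss with respect to w."""
    r = y - X @ w
    norm = np.linalg.norm(w)
    margin = np.abs(r) + eps * norm
    unit = w / norm if norm > 0 else np.zeros_like(w)
    per_sample = -np.sign(r)[:, None] * X + eps * unit
    return 2.0 * np.mean(margin[:, None] * per_sample, axis=0)

def fit_robust(X, y, eps, w_init, steps=500, lr=0.1):
    """Gradient descent that only accepts steps which lower the robust loss."""
    w = w_init.copy()
    loss = robust_loss(w, X, y, eps)
    for _ in range(steps):
        cand = w - lr * robust_grad(w, X, y, eps)
        cand_loss = robust_loss(cand, X, y, eps)
        if cand_loss < loss:
            w, loss = cand, cand_loss
        else:
            lr *= 0.5  # shrink the step when it fails to improve
    return w

# Synthetic data (an assumption for illustration, not from the paper).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
w_true = rng.normal(size=5)
y = X @ w_true + 0.1 * rng.normal(size=200)

# Standard training: ordinary least squares.
w_std, *_ = np.linalg.lstsq(X, y, rcond=None)

# Robust training, started from the standard solution.
eps = 0.1
w_rob = fit_robust(X, y, eps, w_std)

# Working definition of model sensitivity used in this sketch.
sensitivity = natural_loss(w_rob, X, y) - natural_loss(w_std, X, y)
print("natural loss (standard):", natural_loss(w_std, X, y))
print("natural loss (robust):  ", natural_loss(w_rob, X, y))
print("model sensitivity:      ", sensitivity)
```

Because the least-squares solution minimizes the natural loss, the sensitivity computed this way cannot be negative. The brute-force comparison requires retraining for every `eps`, which is the cost a closed-form tool such as the AIF is meant to avoid.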
