LG | ML | Oct 9, 2023

Theoretical Analysis of Robust Overfitting for Wide DNNs: An NTK Approach

arXiv:2310.06112v2 | 9 citations | h-index: 2 | Has Code
Originality: Incremental advance
AI Analysis

This paper addresses robust overfitting in adversarial training, a problem relevant to deep learning practitioners. It offers a theoretical foundation and a novel algorithm, though its contribution is an incremental extension of existing NTK theory.

The paper provides a theoretical explanation for robust overfitting in adversarially trained deep neural networks. It extends neural tangent kernel theory to adversarial training, proves that wide DNNs can be approximated by linearized models, and shows that long-term adversarial training degenerates to the non-robust solution obtained without adversarial training. Experiments show that the proposed Adv-NTK method enables infinite-width DNNs to achieve robustness comparable to that of finite-width ones.

Adversarial training (AT) is a canonical method for enhancing the robustness of deep neural networks (DNNs). However, recent studies have empirically demonstrated that it suffers from robust overfitting, i.e., prolonged AT can be detrimental to the robustness of DNNs. This paper presents a theoretical explanation of robust overfitting for DNNs. Specifically, we non-trivially extend the neural tangent kernel (NTK) theory to AT and prove that an adversarially trained wide DNN can be well approximated by a linearized DNN. Moreover, for squared loss, closed-form AT dynamics for the linearized DNN can be derived, which reveal a new AT degeneration phenomenon: long-term AT causes a wide DNN to degenerate to the one obtained without AT, and thus causes robust overfitting. Based on our theoretical results, we further design a method named Adv-NTK, the first AT algorithm for infinite-width DNNs. Experiments on real-world datasets show that Adv-NTK can help infinite-width DNNs achieve robustness comparable to that of their finite-width counterparts, which in turn justifies our theoretical findings. The code is available at https://github.com/fshp971/adv-ntk.
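
To make the idea of closed-form dynamics for a linearized model concrete, here is a minimal NumPy sketch of the standard (non-adversarial) case under squared loss. It is not the paper's AT dynamics or the Adv-NTK algorithm, which are not reproduced on this page. It assumes a model linearized around its initial parameters, f_lin(theta) = f0 + J (theta - theta0), with kernel K = J J^T, and checks that gradient descent on this model tracks the gradient-flow solution f_t = y + exp(-K t) (f0 - y). All names (`closed_form_outputs`, `simulate_linearized_gd`, the random Jacobian `J`) are hypothetical and chosen for illustration only.

```python
import numpy as np

def closed_form_outputs(K, f0, y, lr, t):
    """Training-set outputs at time t under gradient flow on 0.5 * ||f - y||^2.

    Solves df/dt = -lr * K (f - y), i.e. f_t = y + exp(-lr * K * t) (f0 - y).
    K is assumed symmetric positive semidefinite.
    """
    evals, evecs = np.linalg.eigh(K)
    decay = np.exp(-lr * evals * t)  # per-eigendirection decay of the residual
    return y + evecs @ (decay * (evecs.T @ (f0 - y)))

def simulate_linearized_gd(J, theta0, f0, y, lr, steps):
    """Plain gradient descent on f_lin(theta) = f0 + J (theta - theta0)."""
    theta = theta0.copy()
    for _ in range(steps):
        f = f0 + J @ (theta - theta0)
        theta -= lr * J.T @ (f - y)  # gradient of 0.5 * ||f - y||^2 w.r.t. theta
    return f0 + J @ (theta - theta0)

# Synthetic stand-ins (assumptions): a random Jacobian in place of a real network's.
rng = np.random.default_rng(0)
n, p = 5, 50                       # training points, parameters
J = rng.normal(size=(n, p)) / np.sqrt(p)
theta0 = rng.normal(size=p)
f0 = rng.normal(size=n)            # network outputs at initialization
y = rng.normal(size=n)             # regression targets
K = J @ J.T                        # empirical tangent kernel on the training set

lr, steps = 1e-3, 2000
f_gd = simulate_linearized_gd(J, theta0, f0, y, lr, steps)
f_cf = closed_form_outputs(K, f0, y, lr=1.0, t=lr * steps)
max_gap = float(np.max(np.abs(f_gd - f_cf)))
print("max |GD - closed form| on training outputs:", max_gap)
```

With a small step size the two trajectories agree closely, and as t grows the closed form converges to the targets y whenever K is positive definite. The abstract's degeneration result concerns the analogous long-time limit of the adversarial dynamics, which this sketch does not model.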

Code Implementations: 1 repo
