Robust Deep Learning as Optimal Control: Insights and Convergence Guarantees
This work addresses the slow training time of adversarial training for deep learning, offering theoretical insights and convergence guarantees. It is incremental in that it builds on prior optimal control formulations.
The paper tackles the computational inefficiency of adversarial training for deep neural networks by framing it as an optimal control problem, providing the first convergence analysis that shows how hyperparameters affect stability and convergence, supported by experiments on a robust classification problem.
The fragility of deep neural networks to adversarially chosen inputs has motivated the need to revisit deep learning algorithms. Including adversarial examples during training is a popular defense mechanism against adversarial attacks. This mechanism can be formulated as a min-max optimization problem, where the adversary seeks to maximize the loss function using an iterative first-order algorithm while the learner attempts to minimize it. However, finding adversarial examples in this way causes excessive computational overhead during training. By interpreting the min-max problem as an optimal control problem, it has recently been shown that one can exploit the compositional structure of neural networks in the optimization problem to improve the training time significantly. In this paper, we provide the first convergence analysis of this adversarial training algorithm by combining techniques from robust optimal control and inexact oracle methods in optimization. Our analysis sheds light on how the hyperparameters of the algorithm affect its stability and convergence. We support our insights with experiments on a robust classification problem.
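To make the min-max formulation concrete, the following is a minimal sketch of the baseline it describes: an inner adversary that runs a few projected sign-gradient ascent steps on the input, and an outer learner that takes one descent step on the perturbed batch. It is not the optimal-control algorithm analyzed in the paper. The logistic-regression model, the l-infinity perturbation budget `eps`, and all function names and default values are assumptions chosen for illustration.

```python
import numpy as np


def loss_and_grads(w, x, y):
    """Mean logistic loss for labels y in {-1, +1}.

    Returns the loss, its gradient w.r.t. the weights w, and the
    per-example gradient w.r.t. the inputs x (illustrative stand-in model).
    """
    margin = y * (x @ w)
    loss = np.mean(np.log1p(np.exp(-margin)))
    s = -y / (1.0 + np.exp(margin))          # d loss_i / d (x_i . w)
    grad_w = (x * s[:, None]).mean(axis=0)
    grad_x = s[:, None] * w[None, :]
    return loss, grad_w, grad_x


def pgd_attack(w, x, y, eps, step, n_steps):
    """Inner maximization: iterative first-order ascent on the input,
    projected back onto the l-infinity ball of radius eps (assumed threat model)."""
    delta = np.zeros_like(x)
    for _ in range(n_steps):
        _, _, grad_x = loss_and_grads(w, x + delta, y)
        delta = np.clip(delta + step * np.sign(grad_x), -eps, eps)
    return delta


def adversarial_train(x, y, eps=0.1, lr=0.5, epochs=100, pgd_steps=5):
    """Outer minimization: one gradient step per epoch on adversarial inputs."""
    w = np.zeros(x.shape[1])
    for _ in range(epochs):
        # Each outer step costs pgd_steps extra gradient evaluations.
        delta = pgd_attack(w, x, y, eps, step=eps / 2, n_steps=pgd_steps)
        _, grad_w, _ = loss_and_grads(w, x + delta, y)
        w -= lr * grad_w
    return w


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(size=(200, 2))
    y = np.where(x[:, 0] + 0.5 * x[:, 1] >= 0, 1.0, -1.0)
    w = adversarial_train(x, y)
    delta = pgd_attack(w, x, y, eps=0.1, step=0.05, n_steps=5)
    print("clean loss:", loss_and_grads(w, x, y)[0])
    print("adversarial loss:", loss_and_grads(w, x + delta, y)[0])
```

In this sketch every outer update requires `pgd_steps` additional gradient computations for the attack, which illustrates the kind of overhead the abstract refers to.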