ML · LG · Jun 8, 2020

Distributional Robustness with IPMs and links to Regularization and GANs

arXiv:2006.04349v1 · 24 citations
Originality: Incremental advance
AI Analysis

This work provides theoretical insights into robustness-regularization connections for researchers in adversarial robustness and generative modeling, though it appears incremental as it extends existing DRO frameworks.

The paper studies distributional robustness in machine learning through uncertainty sets constructed with Integral Probability Metrics (IPMs). It shows that Distributionally Robust Optimization (DRO) under any IPM corresponds to a family of regularization penalties, recovering and improving upon existing results for MMD and Wasserstein distances, and it extends these insights to connect GANs with distributional robustness.
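To make the robustness-regularization link concrete, here is the elementary upper bound that follows directly from the definition of an IPM. This is a sketch and not the paper's theorem, which this page does not reproduce. The symbols $h$ (a loss function), $\varepsilon$ (the radius of the uncertainty set), $\lambda$ (a scaling constant) and $\mathcal{F}$ (the function class defining the IPM) are notation assumed here for illustration.

```latex
% Definition of the IPM generated by a function class \mathcal{F}
d_{\mathcal{F}}(P, Q) = \sup_{f \in \mathcal{F}}
  \left| \mathbb{E}_{P}[f] - \mathbb{E}_{Q}[f] \right|

% Suppose h = \lambda f with f \in \mathcal{F} and \lambda \ge 0.
% Then for every Q with d_{\mathcal{F}}(P, Q) \le \varepsilon:
\mathbb{E}_{Q}[h]
  = \mathbb{E}_{P}[h] + \lambda \left( \mathbb{E}_{Q}[f] - \mathbb{E}_{P}[f] \right)
  \le \mathbb{E}_{P}[h] + \lambda \varepsilon

% Taking the supremum over Q and the smallest admissible \lambda:
\sup_{Q \,:\, d_{\mathcal{F}}(P, Q) \le \varepsilon} \mathbb{E}_{Q}[h]
  \le \mathbb{E}_{P}[h]
  + \varepsilon \, \inf \{ \lambda \ge 0 : h \in \lambda \mathcal{F} \}
```

For example, when $\mathcal{F}$ is the set of 1-Lipschitz functions (the Wasserstein-1 distance), the penalty term is $\varepsilon$ times the Lipschitz constant of $h$. When $\mathcal{F}$ is the unit ball of an RKHS (the MMD), it is $\varepsilon$ times the RKHS norm of $h$.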

Robustness to adversarial attacks is an important concern due to the fragility of deep neural networks to small perturbations and has received an abundance of attention in recent years. Distributionally Robust Optimization (DRO), a particularly promising way of addressing this challenge, studies robustness via divergence-based uncertainty sets and has provided valuable insights into robustification strategies such as regularization. In the context of machine learning, the majority of existing results have chosen $f$-divergences, Wasserstein distances and, more recently, the Maximum Mean Discrepancy (MMD) to construct uncertainty sets. We extend this line of work for the purposes of understanding robustness via regularization by studying uncertainty sets constructed with Integral Probability Metrics (IPMs), a large family of divergences including the MMD, Total Variation and Wasserstein distances. Our main result shows that DRO under \textit{any} choice of IPM corresponds to a family of regularization penalties, which recover and improve upon existing results in the setting of MMD and Wasserstein distances. Due to the generality of our result, we show that other choices of IPMs correspond to other commonly used penalties in machine learning. Furthermore, we extend our results to shed light on adversarial generative modelling via $f$-GANs, constituting the first study of distributional robustness for the $f$-GAN objective. Our results unveil the inductive properties of the discriminator set with regard to robustness, allowing us to give positive comments for several penalty-based GAN methods such as Wasserstein-, MMD- and Sobolev-GANs. In summary, our results intimately link GANs to distributional robustness, extend previous results on DRO and contribute to our understanding of the link between regularization and robustness at large.
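As a small numerical illustration of the regularization view in the Wasserstein case, the sketch below checks that shifting every sample by at most `eps` changes the average of an `L`-Lipschitz loss by at most `L * eps`. This is not code from the paper. The function name `robust_gap_demo`, the test loss `L * sin(t)`, and all parameter values are hypothetical choices made for this example.

```python
import math
import random


def robust_gap_demo(n=1000, eps=0.1, lipschitz=2.0, seed=0):
    """Compare the change in expected loss with the bound lipschitz * eps.

    Assumption for illustration: P is an empirical Gaussian sample and Q is
    obtained by moving each point by at most eps, so the pairing (x_i, y_i)
    is a coupling of cost <= eps and hence W1(P, Q) <= eps.
    """
    rng = random.Random(seed)
    xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
    ys = [x + rng.uniform(-eps, eps) for x in xs]

    def h(t):
        # |h'(t)| = lipschitz * |cos(t)| <= lipschitz, so h is lipschitz-Lipschitz
        return lipschitz * math.sin(t)

    mean_p = sum(h(x) for x in xs) / n
    mean_q = sum(h(y) for y in ys) / n
    return mean_q - mean_p, lipschitz * eps


gap, bound = robust_gap_demo()
print(f"E_Q[h] - E_P[h] = {gap:.4f}, Lipschitz penalty bound = {bound:.4f}")
```

The printed gap never exceeds the bound in absolute value, because each individual term satisfies `|h(y) - h(x)| <= lipschitz * |y - x| <= lipschitz * eps`. A random shift is typically far from the worst case, so the gap is usually much smaller than the bound.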

