LG · ML · Feb 21, 2023

Generalization Bounds for Adversarial Contrastive Learning

arXiv:2302.10633v1 · 12 citations · h-index: 5
Originality: Synthesis-oriented
AI Analysis

This work fills a theoretical gap for researchers in robust machine learning by providing foundational insights into adversarial contrastive learning. The contribution is incremental, since it builds on existing methods.

The paper addresses the lack of theoretical understanding of adversarial contrastive learning by analyzing its generalization performance with Rademacher complexity. It shows that the average adversarial risk of the downstream tasks is upper bounded by the adversarial unsupervised risk of the upstream task, and it supports this result with experiments.

Deep networks are well known to be fragile to adversarial attacks, and adversarial training is one of the most popular methods for training a robust model. To take advantage of unlabeled data, recent works have applied adversarial training to contrastive learning (Adversarial Contrastive Learning; ACL for short) and obtained promising robust performance. However, the theory of ACL is not well understood. To fill this gap, we leverage the Rademacher complexity to analyze the generalization performance of ACL, with a particular focus on linear models and multi-layer neural networks under $\ell_p$ attacks ($p \ge 1$). Our theory shows that the average adversarial risk of the downstream tasks can be upper bounded by the adversarial unsupervised risk of the upstream task. The experimental results validate our theory.
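To make the "adversarial unsupervised risk" for linear models under an $\ell_p$ attack more concrete, here is a minimal sketch. It is an illustration under stated assumptions, not the paper's exact definition: it assumes a linear feature map f(x) = Wx, a logistic contrastive loss with a single negative sample, and an attack that perturbs only the anchor x within an $\ell_p$ ball of radius eps. All function names are hypothetical. Under these assumptions the inner maximization has a closed form through the dual norm $\ell_q$ with 1/p + 1/q = 1.

```python
import numpy as np


def dual_norm_order(p):
    """Return q such that 1/p + 1/q = 1 (q = inf for p = 1, q = 1 for p = inf)."""
    if p == 1:
        return np.inf
    if np.isinf(p):
        return 1.0
    return p / (p - 1.0)


def logistic_loss(v):
    """Numerically stable log(1 + exp(-v)); decreasing in the margin v."""
    return np.logaddexp(0.0, -v)


def clean_contrastive_loss(W, x, x_pos, x_neg):
    """Contrastive loss on the margin f(x)^T (f(x_pos) - f(x_neg)) with f(x) = W x."""
    u = W @ (x_pos - x_neg)
    return logistic_loss((W @ x) @ u)


def adversarial_contrastive_loss(W, x, x_pos, x_neg, eps, p):
    """Worst-case loss over anchor perturbations delta with ||delta||_p <= eps.

    Assumption: only the anchor x is attacked. For a linear map the margin
    changes by delta^T (W^T u), whose minimum over the l_p ball is
    -eps * ||W^T u||_q, so no iterative attack is needed.
    """
    u = W @ (x_pos - x_neg)
    margin = (W @ x) @ u
    q = dual_norm_order(p)
    worst_margin = margin - eps * np.linalg.norm(W.T @ u, ord=q)
    return logistic_loss(worst_margin)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    W = rng.normal(size=(3, 5))               # hypothetical 5-dim inputs, 3-dim features
    x, x_pos, x_neg = rng.normal(size=(3, 5))
    print("clean loss:      ", clean_contrastive_loss(W, x, x_pos, x_neg))
    print("adversarial loss:", adversarial_contrastive_loss(W, x, x_pos, x_neg, 0.1, np.inf))
```

Averaging `adversarial_contrastive_loss` over sampled (x, x_pos, x_neg) triples gives an empirical version of the upstream quantity that, according to the abstract, upper bounds the average adversarial risk of the downstream tasks. For multi-layer networks no such closed form is available in general, and the inner maximization is typically approximated with an iterative attack.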

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
