Robustness to Adversarial Perturbations in Learning from Incomplete Data
This work addresses robustness in machine learning when data are incomplete. Its contribution is incremental: it builds on existing frameworks rather than delivering a major breakthrough.
The paper tackles learning from incomplete data under adversarial perturbations by unifying the semi-supervised and distributionally robust learning frameworks. It develops a generalization theory built on novel complexity measures and proposes a hybrid algorithm whose performance is comparable to state-of-the-art methods on real-world benchmarks.
What is the role of unlabeled data in an inference problem when the presumed underlying distribution is adversarially perturbed? To provide a concrete answer to this question, this paper unifies two major learning frameworks: Semi-Supervised Learning (SSL) and Distributionally Robust Learning (DRL). We develop a generalization theory for our framework based on a number of novel complexity measures, such as an adversarial extension of Rademacher complexity and its semi-supervised analogue. Moreover, our analysis quantifies the role of unlabeled data in generalization under a more general condition than existing theoretical work in SSL. Based on our framework, we also present a hybrid of the DRL and EM algorithms that has a guaranteed convergence rate. When implemented with deep neural networks, our method performs comparably to the state of the art on a number of real-world benchmark datasets.
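For orientation, the DRL ingredient named in the abstract is commonly written as a worst-case risk over an ambiguity set around the empirical distribution. The block below is a sketch of that standard formulation, not necessarily the exact objective used in this paper. The symbols $\theta$ (model parameters), $\ell$ (loss), $\hat{P}_n$ (empirical distribution), $W_c$ (Wasserstein distance with transport cost $c$), $\epsilon$ (perturbation budget), and $\gamma$ (penalty weight) are assumed notation that does not appear in the abstract.

```latex
% Assumed notation; a generic Wasserstein DRL objective, not the paper's stated one.
\begin{align}
  % Worst-case risk over all distributions within distance \epsilon of \hat{P}_n
  \min_{\theta} \; \sup_{P \,:\, W_c(P, \hat{P}_n) \le \epsilon}
    \mathbb{E}_{Z \sim P}\bigl[\ell(\theta; Z)\bigr] \\
  % Penalized (Lagrangian) relaxation for a fixed \gamma \ge 0:
  % the supremum moves inside the expectation and acts on each sample separately
  \min_{\theta} \; \mathbb{E}_{Z \sim \hat{P}_n}
    \Bigl[\, \sup_{z'} \bigl\{ \ell(\theta; z') - \gamma\, c(z', Z) \bigr\} \Bigr]
\end{align}
```

The second form is often preferred in practice because the inner supremum can be approximated per sample by gradient ascent. For unlabeled samples the label part of $Z$ is unobserved, which is presumably where an EM-style step over the missing labels enters; the abstract does not specify how the two are combined.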