Adversarial Robustness of Supervised Sparse Coding
This work addresses adversarial examples for practitioners using sparse coding models, offering theoretical insights and practical certificates, though its contribution is incremental in that it builds on existing robustness frameworks.
The paper tackles adversarial robustness in machine learning by analyzing a supervised sparse coding model. It provides a generalization bound and a robustness certificate for end-to-end classification, and demonstrates applicability by computing certified accuracy on real data, comparing favorably with other methods.
Several recent results provide theoretical insights into the phenomenon of adversarial examples. Existing results, however, are often limited by a gap between the simplicity of the models studied and the complexity of those deployed in practice. In this work, we strike a better balance by considering a model that involves learning a representation while at the same time giving a precise generalization bound and a robustness certificate. We focus on the hypothesis class obtained by coupling a sparsity-promoting encoder with a linear classifier, and show an interesting interplay between the expressivity and stability of the (supervised) representation map and a notion of margin in the feature space. We bound the robust risk (to $\ell_2$-bounded perturbations) of hypotheses parameterized by dictionaries that achieve a mild encoder gap on training data. Furthermore, we provide a robustness certificate for end-to-end classification. We demonstrate the applicability of our analysis by computing certified accuracy on real data, and compare with other alternatives for certified robustness.
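To make the hypothesis class concrete, the following is a minimal sketch of a sparsity-promoting encoder followed by a linear classifier. It rests on assumptions not stated in the abstract: the encoder is taken to be an $\ell_1$-regularized least-squares (Lasso) problem solved approximately with ISTA, the classifier is binary, and the names `sparse_encode`, `predict`, `lam`, `w`, and `b` are hypothetical. The quantity `feature_margin` is only the distance from the code to the decision hyperplane in feature space; it is not the paper's robustness certificate and by itself says nothing about perturbations of the input.

```python
import numpy as np

def soft_threshold(z, t):
    """Elementwise soft-thresholding, the proximal operator of t * ||.||_1."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def sparse_encode(x, D, lam, n_iter=500):
    """Approximate argmin_z 0.5 * ||x - D z||_2^2 + lam * ||z||_1 with ISTA.

    Assumption: a Lasso encoder; the abstract only says "sparsity-promoting".
    """
    # Step size 1/L, where L = ||D||_2^2 is the Lipschitz constant of the gradient.
    step = 1.0 / np.linalg.norm(D, 2) ** 2
    z = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ z - x)          # gradient of the quadratic term
        z = soft_threshold(z - step * grad, step * lam)
    return z

def predict(x, D, w, b, lam):
    """Binary prediction from a linear classifier applied to the sparse code."""
    z = sparse_encode(x, D, lam)
    score = float(w @ z + b)
    label = 1 if score >= 0 else -1
    # Distance from the code z to the hyperplane {z : w @ z + b = 0}.
    feature_margin = abs(score) / float(np.linalg.norm(w))
    return label, feature_margin

# Synthetic usage example: a random unit-norm dictionary and a 2-sparse signal.
rng = np.random.default_rng(0)
D = rng.standard_normal((20, 50))
D /= np.linalg.norm(D, axis=0)            # normalize each atom (column)
z_true = np.zeros(50)
z_true[[3, 17]] = [1.5, -2.0]
x = D @ z_true
w = rng.standard_normal(50)
label, feature_margin = predict(x, D, w, b=0.0, lam=0.1)
```

In this sketch the dictionary `D` is fixed and random; in the supervised setting described above it would be learned jointly with the classifier, which is outside the scope of the example.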