LG, ML · Feb 7, 2020

Semantic Robustness of Models of Source Code

Goutham Ramakrishnan, Jordan Henkel, Zi Wang, Aws Albarghouthi, Somesh Jha, Thomas Reps

arXiv:2002.03043v2 21.9122 citations

Originality: Incremental advance

AI Analysis

This paper addresses adversarial robustness for developers and researchers who use AI for code analysis. The advance is incremental, since it applies known adversarial training techniques to a new domain.

The paper tackles the vulnerability of deep neural networks to adversarial examples in source code models by defining a semantics-preserving adversary and using adversarial training, resulting in significant quantitative gains in robustness across languages and architectures.

Deep neural networks are vulnerable to adversarial examples - small input perturbations that result in incorrect predictions. We study this problem for models of source code, where we want the network to be robust to source-code modifications that preserve code functionality. (1) We define a powerful adversary that can employ sequences of parametric, semantics-preserving program transformations; (2) we show how to perform adversarial training to learn models robust to such adversaries; (3) we conduct an evaluation on different languages and architectures, demonstrating significant quantitative gains in robustness.
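
To make the idea of a parametric, semantics-preserving transformation concrete, here is a minimal Python sketch. It is not the authors' implementation: it shows only one transformation (renaming a local variable, where the new name is the parameter) and a brute-force adversary that picks the variant a model finds hardest. The names `rename`, `worst_case_rename`, and `loss_fn` are hypothetical, `loss_fn` stands in for whatever loss a trained model would assign to a program, and the rename is only behavior-preserving under the assumption that the new name is not already used in the code.

```python
import ast
from typing import Callable, Iterable


class RenameVariable(ast.NodeTransformer):
    """Rename every occurrence of one identifier (variables and arguments)."""

    def __init__(self, old: str, new: str) -> None:
        self.old, self.new = old, new

    def visit_Name(self, node: ast.Name) -> ast.Name:
        if node.id == self.old:
            node.id = self.new
        return node

    def visit_arg(self, node: ast.arg) -> ast.arg:
        if node.arg == self.old:
            node.arg = self.new
        return node


def rename(source: str, old: str, new: str) -> str:
    """Apply the transformation; assumes `new` is a fresh name in `source`."""
    tree = RenameVariable(old, new).visit(ast.parse(source))
    return ast.unparse(tree)  # requires Python 3.9+


def worst_case_rename(
    source: str,
    old: str,
    candidates: Iterable[str],
    loss_fn: Callable[[str], float],
) -> str:
    """Search the transformation's parameter space for the highest-loss variant."""
    variants = [rename(source, old, name) for name in candidates]
    return max(variants, key=loss_fn)


if __name__ == "__main__":
    program = "def add(a, b):\n    total = a + b\n    return total\n"
    # Toy stand-in for a model loss: longer programs count as "harder".
    adversarial = worst_case_rename(program, "total", ["x", "accumulator"], loss_fn=len)
    print(adversarial)
```

Under these assumptions, adversarial training would feed the output of `worst_case_rename` (rather than the original program) to the model at each training step, so that the model learns to minimize loss on the hardest functionally equivalent variant. The adversary described in the abstract is stronger than this sketch, since it composes sequences of such transformations.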
