CL · Sep 12, 2022

Semantic-Preserving Adversarial Code Comprehension

arXiv:2209.05130v1 · 583 citations · h-index: 14
Originality: Incremental advance
AI Analysis

This paper addresses the problem of balancing generalization and adversarial robustness in code analysis, a concern for both developers and researchers. It represents an incremental advance that combines aspects of existing approaches.

The paper tackles the trade-off between performance and robustness in pre-trained language models for code comprehension. It proposes SPACE, which searches for worst-case semantic-preserving attacks and trains the model to predict the correct labels under them. The reported result is improved robustness against state-of-the-art attacks together with better task performance.

Based on the tremendous success of pre-trained language models (PrLMs) for source code comprehension tasks, current literature studies either ways to further improve the performance (generalization) of PrLMs, or their robustness against adversarial attacks. However, they have to compromise on the trade-off between the two aspects and none of them consider improving both sides in an effective and practical way. To fill this gap, we propose Semantic-Preserving Adversarial Code Embeddings (SPACE) to find the worst-case semantic-preserving attacks while forcing the model to predict the correct labels under these worst cases. Experiments and analysis demonstrate that SPACE can stay robust against state-of-the-art attacks while boosting the performance of PrLMs for code.
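The idea in the abstract (find a worst-case perturbation, then train the model to stay correct under it) can be illustrated with a toy min-max loop. This is only a minimal sketch under stated assumptions, not the paper's implementation: a logistic classifier stands in for the PrLM, a hypothetical `mask` marks the feature positions that may be perturbed (a stand-in for the semantic-preserving constraint), and `eps`, `steps`, `step_size`, and all function names are illustrative choices.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def loss_and_grads(w, x, y):
    """Logistic loss for a label y in {0, 1}, with gradients w.r.t. w and x."""
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
    p = min(max(p, 1e-12), 1 - 1e-12)  # keep log() finite
    loss = -(y * math.log(p) + (1 - y) * math.log(1 - p))
    err = p - y
    grad_w = [err * xi for xi in x]
    grad_x = [err * wi for wi in w]
    return loss, grad_w, grad_x

def worst_case_perturbation(w, x, y, mask, eps=0.5, steps=5, step_size=0.2):
    """Inner maximization: signed gradient ascent on the loss.

    Only positions where mask[i] is True are perturbed (assumption: these
    play the role of the parts an attacker may change without altering
    semantics), and each entry of delta is clipped to [-eps, eps].
    """
    delta = [0.0] * len(x)
    for _ in range(steps):
        x_adv = [xi + di for xi, di in zip(x, delta)]
        _, _, grad_x = loss_and_grads(w, x_adv, y)
        for i, g in enumerate(grad_x):
            if mask[i]:
                sign = 1.0 if g > 0 else -1.0 if g < 0 else 0.0
                delta[i] = max(-eps, min(eps, delta[i] + step_size * sign))
    return delta

def adversarial_train(data, mask, dim, epochs=50, lr=0.1):
    """Outer minimization: update weights on the perturbed inputs."""
    w = [0.0] * dim
    for _ in range(epochs):
        for x, y in data:
            delta = worst_case_perturbation(w, x, y, mask)
            x_adv = [xi + di for xi, di in zip(x, delta)]
            _, grad_w, _ = loss_and_grads(w, x_adv, y)
            w = [wi - lr * gi for wi, gi in zip(w, grad_w)]
    return w

def predict(w, x):
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) > 0 else 0

# Toy data: feature 0 decides the label; features 1 and 2 are perturbable.
DATA = [([1.0, 0.3, -0.2], 1), ([-1.0, 0.3, -0.2], 0),
        ([1.0, -0.4, 0.1], 1), ([-1.0, -0.4, 0.1], 0)]
MASK = [False, True, True]

if __name__ == "__main__":
    weights = adversarial_train(DATA, MASK, dim=3)
    print([predict(weights, x) for x, _ in DATA])
```

In this toy setting the inner loop pushes the perturbable features in whichever direction raises the loss, so the outer update tends to shrink the weights on those features and rely on the unperturbed one. Whether the same dynamics hold for a real PrLM is an empirical question that the sketch does not answer.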

Code Implementations: 1 repo
