LG SE Jun 22, 2021

On Adversarial Robustness of Synthetic Code Generation

Mrinal Anand, Pratik Kayal, Mayank Singh

arXiv:2106.11629v1 6.56 citations

Originality: Incremental advance

AI Analysis

This work addresses adversarial vulnerabilities in code generation systems, an incremental advance relevant to developers and researchers in automated programming.

The paper tackles adversarial robustness in synthetic code generation for domain-specific languages. It shows that Transformer-based models outperform existing baselines yet still perform poorly under adversarial settings, and it proposes dataset augmentation techniques to reduce the underlying dataset bias.

Automatic code synthesis from natural language descriptions is a challenging task. Recent years have seen massive progress in code generation systems for domain-specific languages (DSLs) that employ sequence-to-sequence deep learning techniques. In this paper, we experiment specifically with AlgoLisp DSL-based generative models and demonstrate significant dataset bias through different classes of adversarial examples. We also experiment with two variants of Transformer-based models that outperform all existing AlgoLisp DSL-based code generation baselines. Like the current state-of-the-art systems, our proposed models also perform poorly under adversarial settings. We therefore propose several dataset augmentation techniques to reduce bias and demonstrate their efficacy through robust experimentation.
