SECLMay 23, 2022

AdaptivePaste: Code Adaptation through Learning Semantics-aware Variable Usage Representations

CambridgeMicrosoft
arXiv:2205.11023v33 citationsh-index: 38
Originality Incremental advance
AI Analysis

This addresses a practical problem for software developers by automating code adaptation to improve productivity and reduce bugs, though it is incremental as it builds on existing learning-based methods.

The paper tackles the code adaptation task for adapting variable identifiers in pasted code snippets to surrounding source code, achieving 79.8% accuracy in Python and reducing manual adaptation time by nearly half in a user study.

In software development, it is common for programmers to copy-paste or port code snippets and then adapt them to their use case. This scenario motivates the code adaptation task -- a variant of program repair which aims to adapt variable identifiers in a pasted snippet of code to the surrounding, preexisting source code. However, no existing approach has been shown to effectively address this task. In this paper, we introduce AdaptivePaste, a learning-based approach to source code adaptation, based on transformers and a dedicated dataflow-aware deobfuscation pre-training task to learn meaningful representations of variable usage patterns. We evaluate AdaptivePaste on a dataset of code snippets in Python. Results suggest that our model can learn to adapt source code with 79.8% accuracy. To evaluate how valuable is AdaptivePaste in practice, we perform a user study with 10 Python developers on a hundred real-world copy-paste instances. The results show that AdaptivePaste reduces the dwell time to nearly half the time it takes for manual code adaptation, and helps to avoid bugs. In addition, we utilize the participant feedback to identify potential avenues for improvement of AdaptivePaste.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes