SE AI · Jun 11, 2022

CodeS: Towards Code Model Generalization Under Distribution Shift

Qiang Hu, Yuejun Guo, Xiaofei Xie, Maxime Cordy, Lei Ma, Mike Papadakis, Yves Le Traon

arXiv:2206.05480v2 · 11.713 citations · h-index: 66 · Has Code

Originality: Synthesis-oriented

AI Analysis

This work addresses the challenge of reliably deploying code models under distribution shift, a concern for both software engineering and AI researchers. It is incremental, however, because it focuses on benchmarking and analysis rather than proposing a new solution.

The paper tackles distribution shift in deep learning models for source code analysis by introducing CodeS, a benchmark dataset covering two programming languages and five shift types. Its experiments show that existing out-of-distribution detectors fail on source code and that all code classification models suffer from shifts, with pre-trained bimodal models proving more resistant.

Distribution shift has long been a challenge for the reliable deployment of deep learning (DL) models because it causes unexpected accuracy degradation. Although DL has become a driving force for large-scale source code analysis in the big code era, limited progress has been made on distribution shift analysis and benchmarking for source code tasks. To fill this gap, this paper proposes CodeS, a distribution shift benchmark dataset for source code learning. Specifically, CodeS supports two programming languages (Java and Python) and five shift types (task, programmer, time-stamp, token, and concrete syntax tree). Extensive experiments based on CodeS reveal that 1) out-of-distribution detectors from other domains (e.g., computer vision) do not generalize to source code, 2) all code classification models suffer from distribution shifts, 3) representation-based shifts have a higher impact on the model than others, and 4) pre-trained bimodal models are relatively more resistant to distribution shifts.
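To make the notion of a shift type concrete, the sketch below partitions a labeled code dataset by one metadata attribute (here the time-stamp; programmer would work the same way) and measures the accuracy gap between the two partitions. It is a minimal illustration under stated assumptions, not the CodeS construction procedure: the `Sample` fields, the helper names, and the cutoff rule are all hypothetical and do not reflect the benchmark's actual schema or API.

```python
from dataclasses import dataclass
from typing import Callable, Hashable, Iterable, List, Tuple


@dataclass(frozen=True)
class Sample:
    # Hypothetical fields for illustration; not the CodeS schema.
    code: str
    label: int
    programmer: str
    timestamp: int  # e.g. submission time as an integer


def split_by_shift(
    samples: Iterable[Sample],
    key: Callable[[Sample], Hashable],
    is_shifted: Callable[[Hashable], bool],
) -> Tuple[List[Sample], List[Sample]]:
    """Partition samples into (in_distribution, shifted) using a metadata key."""
    in_dist: List[Sample] = []
    shifted: List[Sample] = []
    for sample in samples:
        target = shifted if is_shifted(key(sample)) else in_dist
        target.append(sample)
    return in_dist, shifted


def accuracy(predict: Callable[[str], int], samples: List[Sample]) -> float:
    """Fraction of samples whose predicted label matches the true label."""
    if not samples:
        return 0.0
    correct = sum(1 for s in samples if predict(s.code) == s.label)
    return correct / len(samples)


def accuracy_drop(
    predict: Callable[[str], int],
    in_dist: List[Sample],
    shifted: List[Sample],
) -> float:
    """In-distribution accuracy minus accuracy on the shifted partition."""
    return accuracy(predict, in_dist) - accuracy(predict, shifted)
```

For a time-stamp shift, one would pass `key=lambda s: s.timestamp` with a cutoff predicate such as `lambda t: t >= cutoff`; for a programmer shift, `key=lambda s: s.programmer` with a held-out set of author names. A positive `accuracy_drop` indicates degradation on the shifted partition.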
