SE, DB
Mar 23, 2021

RPT: Effective and Efficient Retrieval of Program Translations from Big Code

arXiv:2103.12797v1
5 citations
Originality: Incremental advance
AI Analysis

This work addresses the need for automated program translation in software engineering. It offers a retrieval-based approach intended to be more efficient than manual translation or data-intensive methods.

The paper tackles the problem of automating program translation by retrieving existing translations from Big Code, presenting RPT, a system that achieves efficient cross-language code retrieval without requiring parallel datasets.

Program translation is a growing demand in software engineering. Manual program translation requires programming expertise in both the source and target languages. One way to automate this process is to make use of the big data of programs, i.e., Big Code. In particular, one can search for program translations in Big Code. However, existing code retrieval techniques are not designed for cross-language code retrieval. Other data-driven approaches require human effort to construct cross-language parallel datasets for training translation models. In this paper, we present RPT, a novel code translation retrieval system. We propose a lightweight but informative program representation, which can be generalized to all imperative PLs. Furthermore, we present our index structure and hierarchical filtering mechanism for efficient code retrieval from a Big Code database.
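
The abstract does not specify RPT's representation, index, or filters, so the following is only a rough sketch of the general idea of coarse-to-fine cross-language retrieval, not the paper's method. Everything in it is an assumption made for illustration: the keyword-to-feature mapping, the bag-of-features representation, the inverted index, the cosine ranking, and the names `represent` and `TranslationIndex`.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical mapping from language keywords to language-neutral features.
KEYWORD_FEATURES = {
    "for": "loop", "while": "loop",
    "if": "branch", "else": "branch", "elif": "branch", "switch": "branch",
    "return": "return",
}

def represent(source):
    """Map source text to a bag of language-neutral features (assumed design)."""
    features = Counter()
    for token in re.findall(r"[A-Za-z_]\w*|\d+", source):
        if token in KEYWORD_FEATURES:
            features[KEYWORD_FEATURES[token]] += 1
        elif token.isdigit():
            features["lit:" + token] += 1  # numeric literals survive translation
    return features

def cosine(a, b):
    """Cosine similarity between two feature bags."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

class TranslationIndex:
    """Toy two-stage retrieval: inverted-index filter, then similarity ranking."""

    def __init__(self):
        self.programs = {}                # program id -> (language, features)
        self.postings = defaultdict(set)  # feature -> ids of programs containing it

    def add(self, prog_id, language, source):
        features = represent(source)
        self.programs[prog_id] = (language, features)
        for feature in features:
            self.postings[feature].add(prog_id)

    def search(self, source, target_language, k=3):
        query = represent(source)
        # Stage 1 (coarse): keep programs that share at least one feature
        # with the query and are written in the target language.
        candidates = set()
        for feature in query:
            candidates |= self.postings.get(feature, set())
        candidates = [p for p in candidates
                      if self.programs[p][0] == target_language]
        # Stage 2 (fine): rank the surviving candidates by cosine similarity.
        scored = sorted(((cosine(query, self.programs[p][1]), p)
                         for p in candidates), reverse=True)
        return [p for _, p in scored[:k]]

index = TranslationIndex()
index.add("java_sum", "java",
          "int s = 0; for (int i = 0; i < 10; i++) { s += i; } return s;")
index.add("java_max", "java", "if (a > b) { return a; } else { return b; }")
index.add("py_sum", "python", "s = 0\nfor i in range(10):\n    s += i\nreturn s")

query = "s = 0\nfor i in range(10):\n    s += i\nreturn s"
print(index.search(query, "java", k=1))  # ['java_sum']
```

The cheap first stage shrinks the candidate set before the more expensive ranking runs, which is the usual motivation for hierarchical filtering over a large code database. A real system would need a much richer representation than keyword and literal counts.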

Code Implementations: 1 repo
