SE, AI, IR | Dec 7, 2023

STraceBERT: Source Code Retrieval using Semantic Application Traces

arXiv:2312.04731v1 | h-index: 1 | ESEC/SIGSOFT FSE
Originality: Incremental advance
AI Analysis

This paper addresses code retrieval in software reverse engineering, a problem relevant to engineers and security analysts. It appears incremental because it builds on existing BERT-style models and dynamic analysis techniques.

The paper tackles the challenge of software reverse engineering by introducing STraceBERT, which uses dynamic analysis to record Java library calls and pretrains a BERT-style model on these traces for method source code retrieval, showing effectiveness compared to existing approaches.

Software reverse engineering is an essential task in software engineering and security, but it can be a challenging process, especially for adversarial artifacts. To address this challenge, we present STraceBERT, a novel approach that uses a Java dynamic analysis tool to record calls to core Java libraries and pretrains a BERT-style model on the recorded application traces for effective method source code retrieval from a candidate set. Our experiments demonstrate the effectiveness of STraceBERT in retrieving the source code compared to existing approaches. Our proposed approach offers a promising solution to the problem of code retrieval in software reverse engineering and opens up new avenues for further research in this area.
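
The retrieval step described in the abstract (given a recorded trace, rank methods from a candidate set) can be illustrated with a minimal sketch. This is not the paper's implementation: the bag-of-calls `embed` function is a simple stand-in for the pretrained BERT-style encoder, the sketch assumes each candidate method already has a recorded trace, and the names `embed`, `cosine`, `retrieve`, and the example methods and traces are all hypothetical.

```python
import math
from collections import Counter


def embed(trace):
    """Stand-in encoder: a bag-of-calls count vector.

    Assumption: STraceBERT would use a pretrained BERT-style model here;
    this counter is only a placeholder to make the ranking step concrete.
    """
    return Counter(trace)


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query_trace, candidates):
    """Rank candidate method names by trace similarity, best match first."""
    q = embed(query_trace)
    scored = [(cosine(q, embed(t)), name) for name, t in candidates.items()]
    # Sort by descending score; break ties alphabetically for determinism.
    return [name for _, name in sorted(scored, key=lambda x: (-x[0], x[1]))]


# Hypothetical candidate set: method name -> recorded core-library calls.
candidates = {
    "readConfig": [
        "java.io.FileReader.<init>",
        "java.io.BufferedReader.readLine",
        "java.util.HashMap.put",
    ],
    "hashPassword": [
        "java.security.MessageDigest.getInstance",
        "java.security.MessageDigest.digest",
    ],
    "joinNames": [
        "java.lang.StringBuilder.append",
        "java.lang.StringBuilder.append",
        "java.lang.StringBuilder.toString",
    ],
}

# Hypothetical trace recorded from an unknown artifact.
query = [
    "java.security.MessageDigest.getInstance",
    "java.security.MessageDigest.update",
    "java.security.MessageDigest.digest",
]

ranking = retrieve(query, candidates)
print(ranking)
```

In this toy setting the query shares two library calls with `hashPassword` and none with the other candidates, so `hashPassword` ranks first. A learned encoder would replace `embed` while the ranking logic could stay the same.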

