FuncFooler: A Practical Black-box Attack Against Learning-based Binary Code Similarity Detection Methods
This work addresses a critical security problem for binary analysis tools, but its contribution is incremental because it builds on existing attack methods from adversarial machine learning.
The paper tackles the adversarial vulnerability of learning-based binary code similarity detection methods by designing FuncFooler, a black-box attack algorithm that successfully compromises three models (SAFE, Asm2Vec, jTrans), raising concerns about their reliability in security applications.
Binary code similarity detection (BCSD) measures the similarity of two pieces of binary executable code. Recently, learning-based BCSD methods have achieved great success, outperforming traditional BCSD methods in both detection accuracy and efficiency. However, few studies have examined the adversarial vulnerability of learning-based BCSD methods, and this vulnerability poses hazards in security-related applications. To evaluate their adversarial robustness, this paper designs an efficient black-box adversarial code generation algorithm, namely FuncFooler. FuncFooler constrains the adversarial code 1) to leave the program's control flow graph (CFG) unchanged, and 2) to preserve the program's semantics. Specifically, FuncFooler sequentially 1) determines vulnerable candidates in the malicious code, 2) chooses and inserts adversarial instructions from the benign code, and 3) corrects the semantic side effects of the adversarial code to meet the constraints. Empirically, FuncFooler successfully attacks three learning-based BCSD models, SAFE, Asm2Vec, and jTrans, which calls into question whether learning-based BCSD is desirable.
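The three steps can be illustrated with a minimal Python sketch. It is a toy under stated assumptions, not the paper's implementation: the black-box model is a hypothetical bag-of-mnemonics cosine scorer rather than SAFE, Asm2Vec, or jTrans; candidates are ranked by a deletion-based importance score, which is one plausible heuristic the abstract does not specify; and side effects are corrected by saving and restoring the flags and the destination register around each inserted instruction. All function names (embed, similarity, rank_positions, neutralize, attack) are illustrative.

```python
from collections import Counter
from math import sqrt

# A function is a list of basic blocks; a block is a list of instruction strings.

def embed(func):
    """Toy stand-in for a BCSD model: counts of instruction mnemonics."""
    return Counter(ins.split()[0] for block in func for ins in block)

def similarity(f, g):
    """Black-box oracle: cosine similarity of the two toy embeddings."""
    a, b = embed(f), embed(g)
    dot = sum(a[k] * b[k] for k in a)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def rank_positions(func, reference):
    """Step 1 (assumed heuristic): rank instructions by how much deleting
    each one lowers the similarity score, most important first."""
    base = similarity(func, reference)
    scored = []
    for bi, block in enumerate(func):
        for ii in range(len(block)):
            probe = [list(b) for b in func]
            del probe[bi][ii]
            scored.append((base - similarity(probe, reference), bi, ii))
    return sorted(scored, reverse=True)

def neutralize(ins):
    """Step 3 (simplified): save and restore the flags and the destination
    register so the inserted instruction has no lasting effect. Assumes a
    non-branching instruction whose first operand is a 64-bit register."""
    reg = ins.split()[1].rstrip(",")
    return ["pushfq", f"push {reg}", ins, f"pop {reg}", "popfq"]

def attack(target, reference, benign, threshold=0.5, budget=5):
    """Greedy loop: pick a position (step 1), insert the benign instruction
    that lowers similarity the most (step 2), wrapped by neutralize (step 3)."""
    adv = [list(b) for b in target]
    for _ in range(budget):
        current = similarity(adv, reference)
        if current < threshold:
            break
        _, bi, ii = rank_positions(adv, reference)[0]
        best = None
        for ins in benign:
            trial = [list(b) for b in adv]
            # Insert *before* position ii, so the block terminator stays last
            # and no block is added or split: the CFG is unchanged.
            trial[bi][ii:ii] = neutralize(ins)
            score = similarity(trial, reference)
            if score < current and (best is None or score < best[0]):
                best = (score, trial)
        if best is None:  # no insertion helps; give up
            break
        adv = best[1]
    return adv

# Usage: the target starts out identical to the reference function.
target = [["mov rax, rdi", "add rax, 1", "cmp rax, 10", "jl .loop"], ["ret"]]
benign = ["xor rbx, rbx", "lea rcx, [rdx+8]", "imul rsi, rsi"]
adv = attack(target, target, benign)
print(similarity(target, target), similarity(adv, target))
```

Because every insertion stays inside an existing basic block and uses only non-branching instructions, the number of blocks and their terminators are untouched, which mirrors the CFG constraint. A real attack would query an actual BCSD model and would need a more careful side-effect analysis than this push/pop wrapper, for example for memory writes or stack-sensitive code.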