CR Mar 20, 2023
Adversarial Attacks against Binary Similarity Systems
Gianluca Capozzi, Daniele Cono D'Elia, Giuseppe Antonio Di Luna et al.
In recent years, binary analysis has gained traction as a fundamental approach to inspect software and guarantee its security. Due to the exponential increase of devices running software, much research is now moving towards new autonomous solutions based on deep learning models, as they have been showing state-of-the-art performance in solving binary analysis problems. One of the hot topics in this context is binary similarity, which consists in determining if two functions in assembly code are compiled from the same source code. However, it is unclear how deep learning models for binary similarity behave in an adversarial context. In this paper, we study the resilience of binary similarity models against adversarial examples, showing that they are susceptible to both targeted and untargeted attacks (w.r.t. similarity goals) performed by black-box and white-box attackers. In more detail, we extensively test three current state-of-the-art solutions for binary similarity against two black-box greedy attacks, including a new technique that we call Spatial Greedy, and one white-box attack in which we repurpose a gradient-guided strategy used in attacks on image classifiers.
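To make the idea of a black-box greedy attack concrete, here is a minimal sketch under strong simplifying assumptions. It is not the attack from the paper: the similarity model (`toy_similarity`, a cosine over mnemonic counts), the perturbation (appending one instruction per step), and all names are hypothetical stand-ins. The attacker only queries the model's score and, at each step, keeps the single insertion that moves the score furthest in the desired direction.

```python
import math
from collections import Counter

def toy_similarity(f, g):
    """Hypothetical stand-in for a black-box model: cosine of mnemonic counts."""
    cf, cg = Counter(f), Counter(g)
    dot = sum(cf[k] * cg[k] for k in cf)
    nf = math.sqrt(sum(v * v for v in cf.values()))
    ng = math.sqrt(sum(v * v for v in cg.values()))
    return dot / (nf * ng) if nf and ng else 0.0

def greedy_attack(source, target, candidates, budget, targeted=True):
    """Greedily append instructions to `source`, querying only the similarity score.

    targeted=True  -> raise similarity to `target`
    targeted=False -> lower similarity to `target` (typically the original function)
    """
    current = list(source)
    for _ in range(budget):
        best_score = toy_similarity(current, target)
        best_variant = None
        for ins in candidates:
            variant = current + [ins]  # assumed semantics-preserving insertion
            score = toy_similarity(variant, target)
            improved = score > best_score if targeted else score < best_score
            if improved:
                best_score, best_variant = score, variant
        if best_variant is None:  # no candidate helps: stop early
            break
        current = best_variant
    return current

source = ["mov", "add", "ret"]
target = ["xor", "xor", "mov", "ret"]
candidates = ["xor", "nop", "add"]

adv_targeted = greedy_attack(source, target, candidates, budget=3, targeted=True)
adv_untargeted = greedy_attack(source, source, candidates, budget=3, targeted=False)
```

In a realistic setting the inserted instructions would have to preserve the function's semantics (for example, dead code), and the score would come from the attacked model rather than from a toy function.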
DC Mar 30
Efficient Counting and Simulation in Content-Oblivious Rings
Jérémie Chalopin, Yi-Jun Chang, Giuseppe Antonio Di Luna et al.
In the content-oblivious (CO) model (proposed by Censor-Hillel et al.), processes inhabit an asynchronous network and communicate only by exchanging pulses. A series of works has clarified the computational power of this model. In particular, it was shown that, when a leader is present and the network is 2-edge-connected, content-oblivious communication can simulate classical asynchronous message passing. Subsequent results extended this equivalence to leaderless oriented and unoriented rings, and, under non-uniform assumptions, to general 2-edge-connected networks. The simulator of Censor-Hillel et al. requires $O(n^3b+n^3\log n)$ pulses to emulate the send of a single $b$-bit message, making it impractical even on modest-size networks. We focus on message-efficient computation in CO networks. We study the fundamental problem of counting in ring topologies, both because knowing the exact network size is a basic prerequisite for many distributed tasks and because counting immediately implies a broad class of aggregation primitives. We give an algorithm that counts using $O(n^{1.5})$ pulses in anonymous rings with a leader, and an $O(n\log^2 n)$ algorithm for counting in rings with IDs. Moreover, we show that any counting algorithm in CO requires $\Omega(n\log n)$ pulses. Interestingly, in the course of this investigation, we design a simulator for classic message passing: in one simulated round, each process can send a $b$-bit message to each of its neighbors using only $O(b)$ pulses per process. The simulator extends to general 2-edge-connected networks, after a pre-processing step that requires $O(n^{8}\log n)$ pulses, where $n$ is the number of processes, thus allowing efficient simulation of asynchronous message passing in general 2-edge-connected networks.
CR Dec 5, 2024
On the Lack of Robustness of Binary Function Similarity Systems
Gianluca Capozzi, Tong Tang, Jie Wan et al.
Binary function similarity, which often relies on learning-based algorithms to identify which functions in a pool are most similar to a given query function, is a sought-after topic in different communities, including machine learning, software engineering, and security. Its importance stems from the impact it has in facilitating several crucial tasks, from reverse engineering and malware analysis to automated vulnerability detection. Whereas recent work has shed light on performance for this long-studied problem, the research landscape remains largely lacking in understanding the resiliency of state-of-the-art machine learning models against adversarial attacks. As security requires reasoning about adversaries, in this work we assess the robustness of such models through a simple yet effective black-box greedy attack, which modifies the topology and the content of the control flow of the attacked functions. We demonstrate that this attack is successful in compromising all the models, achieving average attack success rates of 57.06% and 95.81% depending on the problem settings (targeted and untargeted attacks). Our findings are insightful: top performance on clean data does not necessarily relate to top robustness properties, which explicitly highlights performance-robustness trade-offs one should consider when deploying such models, calling for further research.
SE Nov 27, 2020
Who is Debugging the Debuggers? Exposing Debug Information Bugs in Optimized Binaries
Giuseppe Antonio Di Luna, Davide Italiano, Luca Massarelli et al.
Despite the advancements in software testing, bugs still plague deployed software and result in crashes in production. When debugging issues -- sometimes caused by "heisenbugs" -- there is the need to interpret core dumps and reproduce the issue offline on the same binary deployed. This requires the entire toolchain (compiler, linker, debugger) to correctly generate and use debug information. Little attention has been devoted to checking that such information is correctly preserved by modern toolchains' optimization stages. This is particularly important as managing debug information in optimized production binaries is non-trivial, often leading to toolchain bugs that may hinder post-deployment debugging efforts. In this paper, we present Debug$^{2}$, a framework to find debug information bugs in modern toolchains. Our framework feeds random source programs to the target toolchain and surgically compares the debugging behavior of their optimized/unoptimized binary variants. Such differential analysis allows Debug$^{2}$ to check invariants at each debugging step and detect bugs from invariant violations. Our invariants are based on the (in)consistency of common debug entities, such as source lines, stack frames, and function arguments. We show that, while simple, this strategy yields powerful cross-toolchain and cross-language invariants, which can pinpoint several bugs in modern toolchains. We have used Debug$^{2}$ to find 23 bugs in the LLVM toolchain (clang/lldb), 8 bugs in the GNU toolchain (GCC/gdb), and 3 in the Rust toolchain (rustc/lldb) -- with 14 bugs already fixed by the developers.
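The differential idea can be illustrated with a small sketch. The abstract does not spell out the exact invariants, so the one below is a hypothetical example chosen for illustration: every source line the debugger stops on in the optimized binary should also be a line it stops on in the unoptimized one. The trace format (a list of dicts with `line` and `func` keys) and the function name are assumptions, not part of the framework.

```python
def line_invariant_violations(unopt_trace, opt_trace):
    """Return steps of the optimized trace whose source line never appears
    in the unoptimized trace (hypothetical invariant, for illustration only)."""
    reachable = {step["line"] for step in unopt_trace}
    return [step for step in opt_trace if step["line"] not in reachable]

# Hypothetical single-stepping traces for the same program.
unopt = [
    {"line": 3, "func": "main"},
    {"line": 4, "func": "main"},
    {"line": 5, "func": "main"},
]
opt = [
    {"line": 3, "func": "main"},
    {"line": 9, "func": "main"},  # line never stepped on without optimizations
]

violations = line_invariant_violations(unopt, opt)
```

A violation here only flags a candidate for inspection; deciding whether it is a genuine toolchain bug would still need manual triage.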
LG Dec 17, 2019
In Nomine Function: Naming Functions in Stripped Binaries with Neural Networks
Fiorella Artuso, Giuseppe Antonio Di Luna, Luca Massarelli et al.
In this paper we investigate the problem of automatically naming pieces of assembly code. By naming, we mean assigning to an assembly function a string of words that would likely be assigned by a human reverse engineer. We formally and precisely define the framework in which our investigation takes place. That is, we define the problem and provide reasonable justifications for the choices that we made in designing the training and the tests. We performed an analysis on a large real-world corpus consisting of nearly 9 million functions taken from more than 22k pieces of software. In this framework we test baselines coming from the field of Natural Language Processing (e.g., Seq2Seq networks and Transformer). Interestingly, our evaluation shows promising results, beating the state-of-the-art and reaching good performance. We investigate the applicability of fine-tuning (i.e., taking a model already trained on a large generic corpus and retraining it for a specific task). This technique is popular and well-known in the NLP field. Our results confirm that fine-tuning is effective even when neural networks are applied to binaries. We show that a model pre-trained on the aforementioned corpus, when fine-tuned, has higher performance on specific domains (such as predicting names in system utilities, malware, etc.).
CR Nov 13, 2018
SAFE: Self-Attentive Function Embeddings for Binary Similarity
Luca Massarelli, Giuseppe Antonio Di Luna, Fabio Petroni et al.
The binary similarity problem consists in determining if two functions are similar by only considering their compiled form. Advanced techniques for binary similarity recently gained momentum as they can be applied in several fields, such as copyright disputes, malware analysis, vulnerability detection, etc., and thus have an immediate practical impact. Current solutions compare functions by first transforming their binary code into multi-dimensional vector representations (embeddings), and then comparing vectors through simple and efficient geometric operations. However, embeddings are usually derived from binary code using manual feature extraction, which may fail to consider important function characteristics, or may consider features that are not important for the binary similarity problem. In this paper we propose SAFE, a novel architecture for the embedding of functions based on a self-attentive neural network. SAFE works directly on disassembled binary functions, does not require manual feature extraction, is computationally more efficient than existing solutions (i.e., it does not incur the computational overhead of building or manipulating control flow graphs), and is more general as it works on stripped binaries and on multiple architectures. We report the results from a quantitative and qualitative analysis that show how SAFE provides a noticeable performance improvement with respect to previous solutions. Furthermore, we show how clusters of our embedding vectors are closely related to the semantics of the implemented algorithms, paving the way for further interesting applications (e.g., semantic-based binary function search).
LG Oct 23, 2018
Unsupervised Features Extraction for Binary Similarity Using Graph Embedding Neural Networks
Roberto Baldoni, Giuseppe Antonio Di Luna, Luca Massarelli et al.
In this paper we consider the binary similarity problem, which consists in determining if two binary functions are similar by only considering their compiled form. This problem is known to be crucial in several application scenarios, such as copyright disputes, malware analysis, vulnerability detection, etc. The current state-of-the-art solutions in this field work by creating an embedding model that maps binary functions into vectors in $\mathbb{R}^{n}$. Such an embedding model captures syntactic and semantic similarity between binaries, i.e., similar binary functions are mapped to points that are close in the vector space. This strategy has many advantages, one of which is the possibility to precompute embeddings of several binary functions, and then compare them with simple geometric operations (e.g., dot product). In [32] functions are first transformed into Annotated Control Flow Graphs (ACFGs) constituted by manually engineered features and then graphs are embedded into vectors using a deep neural network architecture. In this paper we propose and test several ways to compute annotated control flow graphs that use unsupervised approaches for feature learning, without incurring a human bias. Our methods are inspired by techniques used in the natural language processing community (e.g., we use word2vec to encode assembly instructions). We show that our approach is indeed successful, and it leads to better performance than previous state-of-the-art solutions. Furthermore, we report on a qualitative analysis of function embeddings. We found interesting cases in which embeddings are clustered according to the semantics of the original binary function.
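The "precompute, then compare geometrically" workflow can be sketched in a few lines. This is only an illustration: the embeddings below are made-up three-dimensional vectors rather than the output of any real model, and the names (`cosine_similarity`, `most_similar`, `pool`) are hypothetical. Cosine similarity is used here, which is a dot product on length-normalized vectors.

```python
import math

def cosine_similarity(u, v):
    """Dot product of u and v divided by the product of their Euclidean norms."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention for zero vectors, chosen for this sketch
    return dot / (norm_u * norm_v)

def most_similar(query, pool):
    """Return the name in `pool` (name -> embedding) closest to `query`."""
    return max(pool, key=lambda name: cosine_similarity(query, pool[name]))

# Made-up, precomputed embeddings of two known functions.
pool = {
    "memcpy_like": [0.9, 0.1, 0.0],
    "hash_like": [0.0, 0.2, 0.95],
}
query = [0.8, 0.2, 0.1]  # made-up embedding of an unknown function

best = most_similar(query, pool)
```

Because the pool embeddings are computed once, each lookup costs only one vector comparison per stored function, which is what makes this strategy attractive for large function databases.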