Ruotong Yu

3papers

122citations

Novelty42%

AI Score26

Ranked #167,581 of 201,326 authors (top 83%)#4,965 in CR (top 68%)

3 Papers

CRJul 28, 2020Code

SoK: All You Ever Wanted to Know About x86/x64 Binary Disassembly But Were Afraid to Ask

Chengbin Pang, Ruotong Yu, Yaohui Chen et al.

Disassembly of binary code is hard, but necessary for improving the security of binary software. Over the past few decades, research in binary disassembly has produced many tools and frameworks, which have been made available to researchers and security professionals. These tools employ a variety of strategies that grant them different characteristics. The lack of systematization, however, impedes new research in the area and makes selecting the right tool hard, as we do not understand the strengths and weaknesses of existing tools. In this paper, we systematize binary disassembly through the study of nine popular, open-source tools. We couple the manual examination of their code bases with the most comprehensive experimental evaluation (thus far) using 3,788 binaries. Our study yields a comprehensive description and organization of strategies for disassembly, classifying them as either algorithm or else heuristic. Meanwhile, we measure and report the impact of individual algorithms on the results of each tool. We find that while principled algorithms are used by all tools, they still heavily rely on heuristics to increase code coverage. Depending on the heuristics used, different coverage-vs-correctness trade-offs come in play, leading to tools with different strengths and weaknesses. We envision that these findings will help users pick the right tool and assist researchers in improving binary disassembly.

CRApr 7, 2021

Towards Optimal Use of Exception Handling Information for Function Detection

Chengbin Pang, Ruotong Yu, Dongpeng Xu et al.

Function entry detection is critical for security of binary code. Conventional methods heavily rely on patterns, inevitably missing true functions and introducing errors. Recently, call frames have been used in exception-handling for function start detection. However, existing methods have two problems. First, they combine call frames with heuristic-based approaches, which often brings error and uncertain benefits. Second, they trust the fidelity of call frames, without handling the errors that are introduced by call frames. In this paper, we first study the coverage and accuracy of existing approaches in detecting function starts using call frames. We found that recursive disassembly with call frames can maximize coverage, and using extra heuristic-based approaches does not improve coverage and actually hurts accuracy. Second, we unveil call-frame errors and develop the first approach to fix them, making their use more reliable.

CRJan 24, 2020

Privacy for All: Demystify Vulnerability Disparity of Differential Privacy against Membership Inference Attack

Bo Zhang, Ruotong Yu, Haipei Sun et al.

Machine learning algorithms, when applied to sensitive data, pose a potential threat to privacy. A growing body of prior work has demonstrated that membership inference attack (MIA) can disclose specific private information in the training data to an attacker. Meanwhile, the algorithmic fairness of machine learning has increasingly caught attention from both academia and industry. Algorithmic fairness ensures that the machine learning models do not discriminate a particular demographic group of individuals (e.g., black and female people). Given that MIA is indeed a learning model, it raises a serious concern if MIA ``fairly'' treats all groups of individuals equally. In other words, whether a particular group is more vulnerable against MIA than the other groups. This paper examines the algorithmic fairness issue in the context of MIA and its defenses. First, for fairness evaluation, it formalizes the notation of vulnerability disparity (VD) to quantify the difference of MIA treatment on different demographic groups. Second, it evaluates VD on four real-world datasets, and shows that VD indeed exists in these datasets. Third, it examines the impacts of differential privacy, as a defense mechanism of MIA, on VD. The results show that although DP brings significant change on VD, it cannot eliminate VD completely. Therefore, fourth, it designs a new mitigation algorithm named FAIRPICK to reduce VD. An extensive set of experimental results demonstrate that FAIRPICK can effectively reduce VD for both with and without the DP deployment.