CR, Jul 20, 2020
Confidential Attestation: Efficient in-Enclave Verification of Privacy Policy Compliance
Weijie Liu, Wenhao Wang, Xiaofeng Wang et al.
A trusted execution environment (TEE) such as Intel Software Guard Extensions (SGX) runs a remote attestation to prove to a data owner the integrity of the initial state of an enclave, including the program that operates on her data. For this purpose, the data-processing program is supposed to be open to the owner, so that its functionality can be evaluated before trust is established. Increasingly, however, there are application scenarios in which the program itself needs to be protected, so its compliance with the privacy policies expected by the data owner must be verified without exposing its code. To this end, this paper presents CAT, a new model for TEE-based confidential attestation. Our model is inspired by Proof-Carrying Code, where a code generator produces a proof together with the code and a code consumer verifies the proof against the code to check its compliance with security policies. Given that conventional solutions do not work well under the resource-limited and TCB-frugal TEE, we propose a new design that allows an untrusted out-of-enclave generator to analyze the source code of a program when compiling it into a binary, and a trusted in-enclave consumer to efficiently verify the correctness of the instrumentation and the presence of other protections before running the binary. Our design strategically moves most of the workload to the code generator, which is responsible for producing well-formatted and easy-to-check code, while keeping the consumer simple. In addition, the whole consumer can be made public and verified through a conventional attestation. We implemented this model on Intel SGX and demonstrate that it adds only a very small amount to the TCB. We also thoroughly evaluated its performance on micro- and macro-benchmarks and on real-world applications, showing that the new design incurs only a small overhead when enforcing several categories of security policies.
CR, Feb 18, 2020
Profile-Guided, Multi-Version Binary Rewriting
Xiaozhu Meng, Buddhika Chamith, Ryan Newton
The static instrumentation of machine code, also known as binary rewriting, is a powerful technique, but it suffers from high runtime overhead compared to compiler-level instrumentation. Recent research has shown that tools can achieve near-zero overhead when rewriting binaries (excluding the overhead from the application-specific instrumentation). However, users of binary rewriting tools often have difficulty understanding why their instrumentation is slow and how to optimize it. We are inspired by a traditional program optimization workflow, where one can profile the program execution to identify performance hot spots, modify the source code or apply suitable compiler optimizations, and even apply profile-guided optimization. We present profile-guided, multi-version binary rewriting to enable this optimization workflow for static binary instrumentation. Our new techniques include three components. First, we augment existing binary rewriting to support call path profiling; one can interactively view instrumentation costs and understand the calling contexts in which the costs are incurred. Second, we present Versioned Structure Binary Editing, which is a general binary transformation technique. Third, we use call path profiles to guide the application of binary transformation. We apply our new techniques to shadow stack and basic block code coverage. Our instrumentation optimization workflow helps us identify several opportunities with regard to code transformation and instrumentation data layout. Our evaluation on SPEC CPU 2017 shows that the geometric mean overhead of shadow stack and block coverage is reduced from 7.6% and 161.3% to 1.4% and 4.0%, respectively. We also achieve promising results on Apache HTTP Server, where the shadow stack overhead is reduced from about 20% to 3.5%.
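For readers unfamiliar with the shadow stack instrumentation named in the abstract above, the following is a minimal conceptual sketch in Python. It is not the paper's implementation: the class and method names (`ShadowStack`, `on_call`, `on_return`) are hypothetical, and a real binary rewriter would insert the equivalent checks as machine code at call and return sites.

```python
class ShadowStackViolation(Exception):
    """Raised when a return address does not match the recorded one."""


class ShadowStack:
    """Hypothetical model of shadow stack instrumentation.

    A separate stack records each return address at call time and
    compares it with the address actually used at return time.
    """

    def __init__(self):
        self._stack = []

    def on_call(self, return_address):
        # Instrumentation at a call site: remember where we should return.
        self._stack.append(return_address)

    def on_return(self, return_address):
        # Instrumentation at a return site: check against the saved copy.
        if not self._stack:
            raise ShadowStackViolation("return without a matching call")
        expected = self._stack.pop()
        if expected != return_address:
            raise ShadowStackViolation(
                f"expected {expected:#x}, got {return_address:#x}"
            )

    def depth(self):
        return len(self._stack)


if __name__ == "__main__":
    ss = ShadowStack()
    ss.on_call(0x401000)
    ss.on_return(0x401000)      # matches, no exception
    ss.on_call(0x401000)
    try:
        ss.on_return(0x41414141)  # simulated overwritten return address
    except ShadowStackViolation as exc:
        print("violation detected:", exc)
```

Because these checks run on every call and return, their cost scales with call frequency, which is the kind of instrumentation overhead the abstract's profiling workflow is meant to locate and reduce.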
CR, Sep 21, 2018
Adversarial Binaries for Authorship Identification
Xiaozhu Meng, Barton P. Miller, Somesh Jha
Binary code authorship identification determines the authors of a binary program. Existing techniques have used supervised machine learning for this task. In this paper, we look at this problem from an attacker's perspective. We aim to modify a test binary such that it not only causes misprediction but also maintains the functionality of the original input binary. Attacks against binary code are intrinsically more difficult than attacks against domains such as computer vision, where attackers can change each pixel of the input image independently and still maintain a valid image. For binary code, flipping even one bit may cause the binary to become invalid, to crash at run time, or to lose its original functionality. We investigate two types of attacks: untargeted attacks, which cause misprediction to any of the incorrect authors, and targeted attacks, which cause misprediction to a specific one among the incorrect authors. We develop two key attack capabilities: feature vector modification, which generates an adversarial feature vector that both corresponds to a real binary and causes the required misprediction, and input binary modification, which modifies the input binary to match the adversarial feature vector while maintaining the functionality of the input binary. We evaluated our attack against classifiers trained with a state-of-the-art method for authorship attribution. The classifiers for authorship identification have 91% accuracy on average. Our untargeted attack has a 96% success rate on average, showing that we can effectively suppress the authorship signal. Our targeted attack has a 46% success rate on average, showing that it is possible, but significantly more difficult, to impersonate a specific programmer's style. Our attack reveals that existing binary code authorship identification techniques rely on code features that are easy to modify, and thus are vulnerable to attacks.
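The distinction between untargeted and targeted attacks drawn in the abstract above can be stated as two success predicates. The sketch below is only an illustration of those definitions: the function names and the author labels in the usage example are hypothetical and are not taken from the paper's data or code.

```python
def untargeted_success(predicted, true_author):
    """Untargeted attack succeeds if the classifier names any wrong author."""
    return predicted != true_author


def targeted_success(predicted, true_author, target_author):
    """Targeted attack succeeds only if the classifier names the chosen
    incorrect author."""
    if target_author == true_author:
        raise ValueError("target must be one of the incorrect authors")
    return predicted == target_author


def success_rate(outcomes):
    """Fraction of attack attempts that succeeded (outcomes: list of bool)."""
    if not outcomes:
        raise ValueError("no attack attempts given")
    return sum(outcomes) / len(outcomes)


if __name__ == "__main__":
    # Hypothetical predictions for three modified binaries by "alice".
    predictions = ["bob", "carol", "alice"]
    untargeted = [untargeted_success(p, "alice") for p in predictions]
    targeted = [targeted_success(p, "alice", "bob") for p in predictions]
    print(success_rate(untargeted), success_rate(targeted))
```

Every targeted success is also an untargeted success, but not the reverse, which is consistent with the abstract's observation that the targeted attack is the harder of the two.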