CYDec 22, 2023Code
Joining Forces for Pathology Diagnostics with AI Assistance: The EMPAIA InitiativeNorman Zerbe, Lars Ole Schwen, Christian Geißler et al.
Over the past decade, artificial intelligence (AI) methods in pathology have advanced substantially. However, integration into routine clinical practice has been slow due to numerous challenges, including technical and regulatory hurdles in translating research results into clinical diagnostic products and the lack of standardized interfaces. The open and vendor-neutral EMPAIA initiative addresses these challenges. Here, we provide an overview of EMPAIA's achievements and lessons learned. EMPAIA integrates various stakeholders of the pathology AI ecosystem, i.e., pathologists, computer scientists, and industry. In close collaboration, we developed technical interoperability standards, recommendations for AI testing and product development, and explainability methods. We implemented the modular and open-source EMPAIA platform and successfully integrated 14 AI-based image analysis apps from 8 different vendors, demonstrating how different apps can use a single standardized interface. We prioritized requirements and evaluated the use of AI in real clinical settings with 14 different pathology laboratories in Europe and Asia. In addition to technical developments, we created a forum for all stakeholders to share information and experiences on digital pathology and AI. Commercial, clinical, and academic stakeholders can now adopt EMPAIA's common open-source interfaces, providing a unique opportunity for large-scale standardization and streamlining of processes. Further efforts are needed to effectively and broadly establish AI assistance in routine laboratory use. To this end, a sustainable infrastructure, the non-profit association EMPAIA International, has been established to continue standardization and support broad implementation and advocacy for an AI-assisted digital pathology future.
CRFeb 12, 2016Code
Control-Flow Integrity: Precision, Security, and PerformanceNathan Burow, Scott A. Carr, Joseph Nash et al.
Memory corruption errors in C/C++ programs remain the most common source of security vulnerabilities in today's systems. Control-flow hijacking attacks exploit memory corruption vulnerabilities to divert program execution away from the intended control flow. Researchers have spent more than a decade studying and refining defenses based on Control-Flow Integrity (CFI), and this technique is now integrated into several production compilers. However, so far no study has systematically compared the various proposed CFI mechanisms, nor is there any protocol on how to compare such mechanisms. We compare a broad range of CFI mechanisms using a unified nomenclature based on (i) a qualitative discussion of the conceptual security guarantees, (ii) a quantitative security evaluation, and (iii) an empirical evaluation of their performance in the same test environment. For each mechanism, we evaluate (i) protected types of control-flow transfers, (ii) the precision of the protection for forward and backward edges. For open-source compiler-based implementations, we additionally evaluate (iii) the generated equivalence classes and target sets, and (iv) the runtime performance.
45.7CRMay 8
Deterministic Fully-Static Whole-Binary Translation without HeuristicsHongyu Chen, James McGowan, Michael Franz
We present Elevator, the first binary translator that statically translates entire x86-64 executables to AArch64 without debug information, source code, or assumptions about code layout. Unlike existing systems, which rely on heuristics or runtime fallbacks to handle code-versus-data decoding errors, Elevator considers all possible interpretations of every byte and produces a separate translation for each feasible one ahead of time. Any byte may be interpreted as data, an opcode, or an opcode argument; we generate separate control flow paths for all interpretations, pruning only those leading to abnormal termination. Translations are built by composing code "tiles" automatically derived from a high-level description of the source ISA, yielding a nimble translation framework. The approach is deterministic and produces complete, self-contained binaries with no runtime component in the trusted code base. The principal cost is substantial code size expansion. The key benefit is that the output is the actual code that will run, enabling testing, validation, certification, and cryptographic signing prior to deployment, reducing risk compared to emulators or JIT compilers. We evaluate Elevator on a diverse corpus of real-world binaries, including the entire SPECint 2006 suite, demonstrating that static full-program binary translation can be both reliable and practical. Elevator achieves performance on par with or better than QEMU's user-mode JIT emulation.
CRNov 4, 2020
dMVX: Secure and Efficient Multi-Variant Execution in a Distributed SettingAlexios Voulimeneas, Dokyung Song, Per Larsen et al.
Multi-variant execution (MVX) systems amplify the effectiveness of software diversity techniques. The key idea is to run multiple diversified program variants in lockstep while providing them with the same input and monitoring their run-time behavior for divergences. Thus, adversaries have to compromise all program variants simultaneously to mount an attack successfully. Recent work proposed distributed, heterogeneous MVX systems that leverage different ABIs and ISAs to increase the diversity between program variants further. However, existing distributed MVX system designs suffer from high performance overhead due to time-consuming network transactions for the MVX system's operations. This paper presents dMVX, a novel hybrid distributed MVX design, which incorporates new techniques that significantly reduce the overhead of MVX systems in a distributed setting. Our key insight is that we can intelligently reduce the MVX operations that use expensive network transfers. First, we can limit the monitoring of system calls that are not security-critical. Second, we observe that, in many circumstances, we can also safely cache or avoid replication operations needed for I/O related system calls. Our evaluation shows that dMVX reduces the performance degradation from over 50% to 3.1% for realistic server benchmarks.
CRDec 10, 2019
V0LTpwn: Attacking x86 Processor Integrity from SoftwareZijo Kenjar, Tommaso Frassetto, David Gens et al.
Fault-injection attacks have been proven in the past to be a reliable way of bypassing hardware-based security measures, such as cryptographic hashes, privilege and access permission enforcement, and trusted execution environments. However, traditional fault-injection attacks require physical presence, and hence, were often considered out of scope in many real-world adversary settings. In this paper we show this assumption may no longer be justified. We present V0LTpwn, a novel hardware-oriented but software-controlled attack that affects the integrity of computation in virtually any execution mode on modern x86 processors. To the best of our knowledge, this represents the first attack on x86 integrity from software. The key idea behind our attack is to undervolt a physical core to force non-recoverable hardware faults. Under a V0LTpwn attack, CPU instructions will continue to execute with erroneous results and without crashes, allowing for exploitation. In contrast to recently presented side-channel attacks that leverage vulnerable speculative execution, V0LTpwn is not limited to information disclosure, but allows adversaries to affect execution, and hence, effectively breaks the integrity goals of modern x86 platforms. In our detailed evaluation we successfully launch software-based attacks against Intel SGX enclaves from a privileged process to demonstrate that a V0LTpwn attack can successfully change the results of computations within enclave execution across multiple CPU revisions.
CRMar 8, 2019
DMON: A Distributed Heterogeneous N-Variant SystemAlexios Voulimeneas, Dokyung Song, Fabian Parzefall et al.
N-Variant Execution (NVX) systems utilize software diversity techniques for enhancing software security. The general idea is to run multiple different variants of the same program alongside each other while monitoring their run-time behavior. If the internal disparity between the running variants causes observable differences in response to malicious inputs, the monitor can detect such divergences in execution and then raise an alert and/or terminate execution. Existing NVX systems execute multiple, artificially diversified program variants on a single host. This paper presents a novel, distributed NVX design that executes program variants across multiple heterogeneous host computers; our prototype implementation combines an x86-64 host with an ARMv8 host. Our approach greatly increases the level of "internal different-ness" between the simultaneously running variants that can be supported, encompassing different instruction sets, endianness, calling conventions, system call interfaces, and potentially also differences in hardware security features. A major challenge to building such a heterogeneous distributed NVX system is performance. We present solutions to some of the main performance challenges. We evaluate our prototype system implementing these ideas to show that it can provide reasonable performance on a wide range of realistic workloads.
CRJun 12, 2018
SoK: Sanitizing for SecurityDokyung Song, Julian Lettner, Prabhu Rajasekaran et al.
The C and C++ programming languages are notoriously insecure yet remain indispensable. Developers therefore resort to a multi-pronged approach to find security issues before adversaries. These include manual, static, and dynamic program analysis. Dynamic bug finding tools --- henceforth "sanitizers" --- can find bugs that elude other types of analysis because they observe the actual execution of a program, and can therefore directly observe incorrect program behavior as it happens. A vast number of sanitizers have been prototyped by academics and refined by practitioners. We provide a systematic overview of sanitizers with an emphasis on their role in finding security issues. Specifically, we taxonomize the available tools and the security vulnerabilities they cover, describe their performance and compatibility properties, and highlight various trade-offs.
CRNov 22, 2017
PartiSan: Fast and Flexible Sanitization via Run-time PartitioningJulian Lettner, Dokyung Song, Taemin Park et al.
Sanitizers can detect security vulnerabilities in C/C++ code that elude static analysis. Current practice is to continuously fuzz and sanitize internal pre-release builds. Sanitization-enabled builds are rarely released publicly. This is in large part due to the high memory and processing requirements of sanitizers. We present PartiSan, a run-time partitioning technique that speeds up sanitizers and allows them to be used in a more flexible manner. Our core idea is to partition the execution into sanitized slices that incur a run-time overhead, and unsanitized slices running at full speed. With PartiSan, sanitization is no longer an all-or-nothing proposition. A single build can be distributed to every user regardless of their willingness to enable sanitization and the capabilities of their host system. PartiSan can automatically adjust the amount of sanitization to fit within a performance budget or disable sanitization if the host lacks sufficient resources. The flexibility afforded by run-time partitioning also means that we can alternate between different types of sanitizers dynamically; today, developers have to pick a single type of sanitizer ahead of time. Finally, we show that run-time partitioning can speed up fuzzing by running the sanitized partition only when the fuzzer discovers an input that causes a crash or uncovers new execution paths.
CRSep 27, 2014
Similarity-based matching meets Malware DiversityMathias Payer, Stephen Crane, Per Larsen et al.
Similarity metrics, e.g., signatures as used by anti-virus products, are the dominant technique to detect if a given binary is malware. The underlying assumption of this approach is that all instances of a malware (or even malware family) will be similar to each other. Software diversification is a probabilistic technique that uses code and data randomization and expressiveness in the target instruction set to generate large amounts of functionally equivalent but different binaries. Malware diversity builds on software diversity and ensures that any two diversified instances of the same malware have low similarity (according to a set of similarity metrics). An LLVM-based prototype implementation diversifies both code and data of binaries and our evaluation shows that signatures based on similarity only match one or few instances in a pool of diversified binaries generated from the same source code.