Sébastien Bardin

CR
13papers
267citations
Novelty56%
AI Score43

13 Papers

21.6PLApr 27
Hybrid Path-Sums for Hybrid Quantum Programs

Christophe Chareton, Jad Issa, Mathieu Nguyen et al.

As quantum computing becomes an emerging reality, designing efficient quantum programming capabilities is becoming more and more important. Particularly, the debugging and validation of quantum programs is of paramount importance, as these programs are by definition hard to test. Static analysis and formal verification methods for quantum programs started to emerge a few years now, yet they often miss hybrid quantum/classical reasoning facilities with, e.g., generic quantum control, classical control and classical computation instructions. In this paper, we lay out the foundations of a framework for the automated formal verification of (full) hybrid quantum programs featuring both classical and quantum control, measurement and hybrid data structures. In particular, we propose: (1) a novel symbolic representation for describing and manipulating sets of hybrid quantum/classical states called Hybrid Path-Sums (HPS); (2) a set of rewriting rules providing a rich mechanism for simplifying and reasoning on these symbolic hybrid states, and (3) a core assertion language to specify equivalence of hybrid quantum programs, the satisfaction of properties on (parts of) hybrid states, and the extraction of probabilistic statements about the program behavior. We prove the correctness of the novel symbolic representation, of its rewriting system and of the specification system. Finally, we propose a full implementation of this framework as a dedicated symbolic execution engine for hybrid programs. We present an evaluation of a set of representative hybrid case-studies from the literature, showcasing the advantage of our approach and its efficiency compared to state-of-the-art solutions.

CRFeb 9, 2021
AI-based Blackbox Code Deobfuscation: Understand, Improve and Mitigate

Grégoire Menguy, Sébastien Bardin, Richard Bonichon et al.

Code obfuscation aims at protecting Intellectual Property and other secrets embedded into software from being retrieved. Recent works leverage advances in artificial intelligence with the hope of getting blackbox deobfuscators completely immune to standard (whitebox) protection mechanisms. While promising, this new field of AI-based blackbox deobfuscation is still in its infancy. In this article we deepen the state of AI-based blackbox deobfuscation in three key directions: understand the current state-of-the-art, improve over it and design dedicated protection mechanisms. In particular, we define a novel generic framework for AI-based blackbox deobfuscation encompassing prior work and highlighting key components; we are the first to point out that the search space underlying code deobfuscation is too unstable for simulation-based methods (e.g., Monte Carlo Tres Search used in prior work) and advocate the use of robust methods such as S-metaheuritics; we propose the new optimized AI-based blackbox deobfuscator Xyntia which significantly outperforms prior work in terms of success rate (especially with small time budget) while being completely immune to the most recent anti-analysis code obfuscation methods; and finally we propose two novel protections against AI-based blackbox deobfuscation, allowing to counter Xyntia's powerful attacks.

CRNov 30, 2020
No Crash, No Exploit: Automated Verification of Embedded Kernels

Olivier Nicole, Matthieu Lemerre, Sébastien Bardin et al.

The kernel is the most safety- and security-critical component of many computer systems, as the most severe bugs lead to complete system crash or exploit. It is thus desirable to guarantee that a kernel is free from these bugs using formal methods, but the high cost and expertise required to do so are deterrent to wide applicability. We propose a method that can verify both absence of runtime errors (i.e. crashes) and absence of privilege escalation (i.e. exploits) in embedded kernels from their binary executables. The method can verify the kernel runtime independently from the application, at the expense of only a few lines of simple annotations. When given a specific application, the method can verify simple kernels without any human intervention. We demonstrate our method on two different use cases: we use our tool to help the development of a new embedded real-time kernel, and we verify an existing industrial real-time kernel executable with no modification. Results show that the method is fast, simple to use, and can prevent real errors and security vulnerabilities.

CRMar 19, 2020
Automatically Proving Microkernels Free from Privilege Escalation from their Executable

Olivier Nicole, Matthieu Lemerre, Sébastien Bardin et al.

Operating system kernels are the security keystone of most computer systems, as they provide the core protection mechanisms. Kernels are in particular responsible for their own security, i.e. they must prevent untrusted user tasks from reaching their level of privilege. We demonstrate that proving such absence of privilege escalation is a pre-requisite for any definitive security proof of the kernel. While prior OS kernel formal verifications were performed either on source code or crafted kernels, with manual or semi-automated methods requiring significant human efforts in annotations or proofs, we show that it is possible to compute such kernel security proofs using fully-automated methods and starting from the executable code of an existing microkernel with no modification, thus formally verifying absence of privilege escalation with high confidence for a low cost. We applied our method on two embedded microkernels, including the industrial kernel AnonymOS: with only 58 lines of annotation and less than 10 minutes of computation, our method finds a vulnerability in a first (buggy) version of AnonymOS and verifies absence of privilege escalation in a second (secure) version.

CRFeb 25, 2020
Binary-level Directed Fuzzing for Use-After-Free Vulnerabilities

Manh-Dung Nguyen, Sébastien Bardin, Richard Bonichon et al.

Directed fuzzing focuses on automatically testing specific parts of the code by taking advantage of additional information such as (partial) bug stack trace, patches or risky operations. Key applications include bug reproduction, patch testing and static analysis report verification. Although directed fuzzing has received a lot of attention recently, hard-to-detect vulnerabilities such as Use-After-Free (UAF) are still not well addressed, especially at the binary level. We propose UAFuzz, the first (binary-level) directed greybox fuzzer dedicated to UAF bugs. The technique features a fuzzing engine tailored to UAF specifics, a lightweight code instrumentation and an efficient bug triage step. Experimental evaluation for bug reproduction on real cases demonstrates that UAFuzz significantly outperforms state-of-the-art directed fuzzers in terms of fault detection rate, time to exposure and bug triaging. UAFuzz has also been proven effective in patch testing, leading to the discovery of 30 new bugs (7 CVEs) in programs such as Perl, GPAC and GNU Patch. Finally, we provide to the community a large fuzzing benchmark dedicated to UAF, built on both real codes and real bugs.

CRDec 18, 2019
Binsec/Rel: Efficient Relational Symbolic Execution for Constant-Time at Binary-Level

Lesly-Ann Daniel, Sébastien Bardin, Tamara Rezk

The constant-time programming discipline (CT) is an efficient countermeasure against timing side-channel attacks, requiring the control flow and the memory accesses to be independent from the secrets. Yet, writing CT code is challenging as it demands to reason about pairs of execution traces (2- hypersafety property) and it is generally not preserved by the compiler, requiring binary-level analysis. Unfortunately, current verification tools for CT either reason at higher level (C or LLVM), or sacrifice bug-finding or bounded-verification, or do not scale. We tackle the problem of designing an efficient binary-level verification tool for CT providing both bug-finding and bounded-verification. The technique builds on relational symbolic execution enhanced with new optimizations dedicated to information flow and binary-level analysis, yielding a dramatic improvement over prior work based on symbolic execution. We implement a prototype, Binsec/Rel, and perform extensive experiments on a set of 338 cryptographic implementations, demonstrating the benefits of our approach in both bug-finding and bounded-verification. Using Binsec/Rel, we also automate a previous manual study of CT preservation by compilers. Interestingly, we discovered that gcc -O0 and backend passes of clang introduce violations of CT in implementations that were previously deemed secure by a state-of-the-art CT verification tool operating at LLVM level, showing the importance of reasoning at binary-level.

CRAug 5, 2019
How to Kill Symbolic Deobfuscation for Free; or Unleashing the Potential of Path-Oriented Protections

Mathilde Ollivier, Sébastien Bardin, Richard Bonichon et al.

Code obfuscation is a major tool for protecting software intellectual property from attacks such as reverse engineering or code tampering. Yet, recently proposed (automated) attacks based on Dynamic Symbolic Execution (DSE) shows very promising results, hence threatening software integrity. Current defenses are not fully satisfactory, being either not efficient against symbolic reasoning, or affecting runtime performance too much, or being too easy to spot. We present and study a new class of anti-DSE protections coined as path-oriented protections targeting the weakest spot of DSE, namely path exploration. We propose a lightweight, efficient, resistant and analytically proved class of obfuscation algorithms designed to hinder DSE-based attacks. Extensive evaluation demonstrates that these approaches critically counter symbolic deobfuscation while yielding only a very slight overhead.

SEDec 28, 2017
Abstract Interpretation using a Language of Symbolic Approximation

Matthieu Lemerre, Sébastien Bardin

The traditional abstract domain framework for imperative programs suffers from several shortcomings; in particular it does not allow precise symbolic abstractions. To solve these problems, we propose a new abstract interpretation framework, based on symbolic expressions used both as an abstraction of the program, and as the input analyzed by abstract domains. We demonstrate new applications of the frame- work: an abstract domain that efficiently propagates constraints across the whole program; a new formalization of functor domains as approximate translation, which allows the production of approximate programs, on which we can perform classical symbolic techniques. We used these to build a complete analyzer for embedded C programs, that demonstrates the practical applicability of the framework.

SEAug 29, 2017
Freeing Testers from Polluting Test Objectives

Michaël Marcozzi, Sébastien Bardin, Nikolai Kosmatov et al.

Testing is the primary approach for detecting software defects. A major challenge faced by testers lies in crafting efficient test suites, able to detect a maximum number of bugs with manageable effort. To do so, they rely on coverage criteria, which define some precise test objectives to be covered. However, many common criteria specify a significant number of objectives that occur to be infeasible or redundant in practice, like covering dead code or semantically equal mutants. Such objectives are well-known to be harmful to the design of test suites, impacting both the efficiency and precision of testers' effort. This work introduces a sound and scalable formal technique able to prune out a significant part of the infeasible and redundant objectives produced by a large panel of white-box criteria. In a nutshell, we reduce this challenging problem to proving the validity of logical assertions in the code under test. This technique is implemented in a tool that relies on weakest-precondition calculus and SMT solving for proving the assertions. The tool is built on top of the Frama-C verification platform, which we carefully tune for our specific scalability needs. The experiments reveal that the tool can prune out up to 27% of test objectives in a program and scale to applications of 200K lines of code.

CRDec 16, 2016
Targeting Infeasibility Questions on Obfuscated Codes

Robin David, Sébastien Bardin, Jean-Yves Marion

Software deobfuscation is a crucial activity in security analysis and especially, in malware analysis. While standard static and dynamic approaches suffer from well-known shortcomings, Dynamic Symbolic Execution (DSE) has recently been proposed has an interesting alternative, more robust than static analysis and more complete than dynamic analysis. Yet, DSE addresses certain kinds of questions encountered by a reverser namely feasibility questions. Many issues arising during reverse, e.g. detecting protection schemes such as opaque predicates fall into the category of infeasibility questions. In this article, we present the Backward-Bounded DSE, a generic, precise, efficient and robust method for solving infeasibility questions. We demonstrate the benefit of the method for opaque predicates and call stack tampering, and give some insight for its usage for some other protection schemes. Especially, the technique has successfully been used on state-of-the-art packers as well as on the government-grade X-Tunnel malware -- allowing its entire deobfuscation. Backward-Bounded DSE does not supersede existing DSE approaches, but rather complements them by addressing infeasibility questions in a scalable and precise manner. Following this line, we propose sparse disassembly, a combination of Backward-Bounded DSE and static disassembly able to enlarge dynamic disassembly in a guaranteed way, hence getting the best of dynamic and static disassembly. This work paves the way for robust, efficient and precise disassembly tools for heavily-obfuscated binaries.

SESep 5, 2016
Generic and Effective Specification of Structural Test Objectives

Sébastien Bardin, Mickaël Delahaye, Nikolai Kosmatov et al.

While a wide range of different, sometimes heterogeneous test coverage criteria have been proposed, there exists no generic formalism to describe them, and available test automation tools usually support only a small subset of them. We introduce a unified specification language, called HTOL, providing a powerful generic mechanism to define test objectives, which permits encoding numerous existing criteria and supporting them in a unified way. HTOL comes with a formal semantics and can express complex requirements over several executions (using a novel notion of hyperlabels), as well as alternative requirements or requirements over a whole program execution. A novel classification of a large class of existing criteria is proposed. Finally, a coverage measurement tool for HTOL objectives has been implemented. Initial experiments suggest that the proposed approach is both efficient and practical.

LODec 1, 2013
A Combined Approach for Constraints over Finite Domains and Arrays

Sébastien Bardin, Arnaud Gotlieb

Arrays are ubiquitous in the context of software verification. However, effective reasoning over arrays is still rare in CP, as local reasoning is dramatically ill-conditioned for constraints over arrays. In this paper, we propose an approach combining both global symbolic reasoning and local consistency filtering in order to solve constraint systems involving arrays (with accesses, updates and size constraints) and finite-domain constraints over their elements and indexes. Our approach, named FDCC, is based on a combination of a congruence closure algorithm for the standard theory of arrays and a CP solver over finite domains. The tricky part of the work lies in the bi-directional communication mechanism between both solvers. We identify the significant information to share, and design ways to master the communication overhead. Experiments on random instances show that FDCC solves more formulas than any portfolio combination of the two solvers taken in isolation, while overhead is kept reasonable.

SEAug 19, 2013
Efficient Leverage of Symbolic ATG Tools to Advanced Coverage Criteria

Sébastien Bardin, Nikolai Kosmatov, François Cheynier

Automatic test data generation (ATG) is a major topic in software engineering. In this paper, we seek to bridge the gap between the coverage criteria supported by symbolic ATG tools and the most advanced coverage criteria found in the literature. We define a new testing criterion, label coverage, and prove it to be both expressive and amenable to efficient automation. We propose several innovative techniques resulting in an effective black-box support for label coverage, while a direct approach induces an exponential blow-up of the search space. Initial experiments show that ATG for label coverage can be achieved at a reasonable cost and that our optimisations yield very significant savings.