Gilles Barthe

CR
17papers
1,739citations
Novelty54%
AI Score44

17 Papers

QUANT-PHApr 1
Traq: Estimating the Quantum Cost of Classical Programs

Anurudh Peduri, Jam Kabeer Ali Khan, Gilles Barthe et al.

Predicting practical speedups offered by future quantum computers has become a major focus of the quantum community. Typically, such predictions involve numerical simulations supported by lengthy manual analyses and are carried out for one specific algorithm at a time. In this work, we present Traq, a principled approach towards estimating the quantum speedup of classical programs fully automatically. It consists of a classical language that includes high-level primitives amenable to quantum speedups, a compilation to low-level quantum programs, and a source-level cost analysis with provable guarantees. Our cost analysis upper bounds the complexity of the resulting quantum program and is sensitive to the input data of the program (in addition to providing worst-case costs). Traq is implemented as a Haskell package with an extensive evaluation.

CRMay 12, 2021
SoK: Practical Foundations for Software Spectre Defenses

Sunjay Cauligi, Craig Disselkoen, Daniel Moghimi et al.

Spectre vulnerabilities violate our fundamental assumptions about architectural abstractions, allowing attackers to steal sensitive data despite previously state-of-the-art countermeasures. To defend against Spectre, developers of verification tools and compiler-based mitigations are forced to reason about microarchitectural details such as speculative execution. In order to aid developers with these attacks in a principled way, the research community has sought formal foundations for speculative execution upon which to rebuild provable security guarantees. This paper systematizes the community's current knowledge about software verification and mitigation for Spectre. We study state-of-the-art software defenses, both with and without associated formal models, and use a cohesive framework to compare the security properties each defense provides. We explore a wide variety of tradeoffs in the expressiveness of formal frameworks, the complexity of defense tools, and the resulting security guarantees. As a result of our analysis, we suggest practical choices for developers of analysis and mitigation tools, and we identify several open problems in this area to guide future work on grounded software defenses.

CRNov 16, 2020
Reconciling Security and Utility in Next-Generation Epidemic Risk Mitigation Systems

Pierfrancesco Ingo, Nichole Boufford, Ming Cheng Jiang et al.

Epidemics like the recent COVID-19 require proactive contact tracing and epidemiological analysis to predict and subsequently contain infection transmissions. The proactive measures require large scale data collection, which simultaneously raise concerns regarding users' privacy. Digital contact tracing systems developed in response to COVID-19 either collected extensive data for effective analytics at the cost of users' privacy or collected minimal data for the sake of user privacy but were ineffective in predicting and mitigating the epidemic risks. We present Silmarillion--in preparation for future epidemics--a system that reconciles user's privacy with rich data collection for higher utility. In Silmarillion, user devices record Bluetooth encounters with beacons installed in strategic locations. The beacons further enrich the encounters with geo-location, location type, and environment conditions at the beacon installation site. This enriched information enables detailed scientific analysis of disease parameters as well as more accurate personalized exposure risk notification. At the same time, Silmarillion provides privacy to all participants and non-participants at the same level as that guaranteed in digital and manual contact tracing. We describe the design of Silmarillion and its communication protocols that ensure user privacy and data security. We also evaluate a prototype of Silmarillion built using low-end IoT boards, showing that the power consumption and user latencies are adequately low for a practical deployment. Finally, we briefly report on a small-scale deployment within a university building as a proof-of-concept.

CRNov 12, 2020
Deciding Accuracy of Differential Privacy Schemes

Gilles Barthe, Rohit Chadha, Paul Krogmeier et al.

Differential privacy is a mathematical framework for developing statistical computations with provable guarantees of privacy and accuracy. In contrast to the privacy component of differential privacy, which has a clear mathematical and intuitive meaning, the accuracy component of differential privacy does not have a generally accepted definition; accuracy claims of differential privacy algorithms vary from algorithm to algorithm and are not instantiations of a general definition. We identify program discontinuity as a common theme in existing \emph{ad hoc} definitions and introduce an alternative notion of accuracy parametrized by, what we call, {\distance} -- the {\distance} of an input $x$ w.r.t., a deterministic computation $f$ and a distance $d$, is the minimal distance $d(x,y)$ over all $y$ such that $f(y)\neq f(x)$. We show that our notion of accuracy subsumes the definition used in theoretical computer science, and captures known accuracy claims for differential privacy algorithms. In fact, our general notion of accuracy helps us prove better claims in some cases. Next, we study the decidability of accuracy. We first show that accuracy is in general undecidable. Then, we define a non-trivial class of probabilistic computations for which accuracy is decidable (unconditionally, or assuming Schanuel's conjecture). We implement our decision procedure and experimentally evaluate the effectiveness of our approach for generating proofs or counterexamples of accuracy for common algorithms from the literature.

LGOct 10, 2020
Scaling Guarantees for Nearest Counterfactual Explanations

Kiarash Mohammadi, Amir-Hossein Karimi, Gilles Barthe et al.

Counterfactual explanations (CFE) are being widely used to explain algorithmic decisions, especially in consequential decision-making contexts (e.g., loan approval or pretrial bail). In this context, CFEs aim to provide individuals affected by an algorithmic decision with the most similar individual (i.e., nearest individual) with a different outcome. However, while an increasing number of works propose algorithms to compute CFEs, such approaches either lack in optimality of distance (i.e., they do not return the nearest individual) and perfect coverage (i.e., they do not provide a CFE for all individuals); or they cannot handle complex models, such as neural networks. In this work, we provide a framework based on Mixed-Integer Programming (MIP) to compute nearest counterfactual explanations with provable guarantees and with runtimes comparable to gradient-based approaches. Our experiments on the Adult, COMPAS, and Credit datasets show that, in contrast with previous methods, our approach allows for efficiently computing diverse CFEs with both distance guarantees and perfect coverage.

LGOct 8, 2020
A survey of algorithmic recourse: definitions, formulations, solutions, and prospects

Amir-Hossein Karimi, Gilles Barthe, Bernhard Schölkopf et al.

Machine learning is increasingly used to inform decision-making in sensitive situations where decisions have consequential effects on individuals' lives. In these settings, in addition to requiring models to be accurate and robust, socially relevant values such as fairness, privacy, accountability, and explainability play an important role for the adoption and impact of said technologies. In this work, we focus on algorithmic recourse, which is concerned with providing explanations and recommendations to individuals who are unfavourably treated by automated decision-making systems. We first perform an extensive literature review, and align the efforts of many authors by presenting unified definitions, formulations, and solutions to recourse. Then, we provide an overview of the prospective research directions towards which the community may engage, challenging existing assumptions and making explicit connections to other ethical challenges such as security, privacy, and fairness.

CROct 9, 2019
Deciding Differential Privacy for Programs with Finite Inputs and Outputs

Gilles Barthe, Rohit Chadha, Vishal Jagannath et al.

Differential privacy is a de facto standard for statistical computations over databases that contain private data. The strength of differential privacy lies in a rigorous mathematical definition that guarantees individual privacy and yet allows for accurate statistical results. Thanks to its mathematical definition, differential privacy is also a natural target for formal analysis. A broad line of work uses logical methods for proving privacy. However, these methods are not complete, and only partially automated. A recent and complementary line of work uses statistical methods for finding privacy violations. However, the methods only provide statistical guarantees (but no proofs). We propose the first decision procedure for checking the differential privacy of a non-trivial class of probabilistic computations. Our procedure takes as input a program P parametrized by a privacy budget $ε$, and either proves differential privacy for all possible values of $ε$ or generates a counterexample. In addition, our procedure applies both to $ε$-differential privacy and $(ε,δ)$-differential privacy. Technically, the decision procedure is based on a novel and judicious encoding of the semantics of programs in our class into a decidable fragment of the first-order theory of the reals with exponentiation. We implement our procedure and use it for (dis)proving privacy bounds for many well-known examples, including randomized response, histogram, report noisy max and sparse vector.

CROct 3, 2019
Constant-Time Foundations for the New Spectre Era

Sunjay Cauligi, Craig Disselkoen, Klaus v. Gleissenthall et al.

The constant-time discipline is a software-based countermeasure used for protecting high assurance cryptographic implementations against timing side-channel attacks. Constant-time is effective (it protects against many known attacks), rigorous (it can be formalized using program semantics), and amenable to automated verification. Yet, the advent of micro-architectural attacks makes constant-time as it exists today far less useful. This paper lays foundations for constant-time programming in the presence of speculative and out-of-order execution. We present an operational semantics and a formal definition of constant-time programs in this extended setting. Our semantics eschews formalization of microarchitectural features (that are instead assumed under adversary control), and yields a notion of constant-time that retains the elegance and tractability of the usual notion. We demonstrate the relevance of our semantics in two ways: First, by contrasting existing Spectre-like attacks with our definition of constant-time. Second, by implementing a static analysis tool, Pitchfork, which detects violations of our extended constant-time property in real world cryptographic libraries.

LGMay 29, 2019
Privacy Amplification by Mixing and Diffusion Mechanisms

Borja Balle, Gilles Barthe, Marco Gaboardi et al.

A fundamental result in differential privacy states that the privacy guarantees of a mechanism are preserved by any post-processing of its output. In this paper we investigate under what conditions stochastic post-processing can amplify the privacy of a mechanism. By interpreting post-processing as the application of a Markov operator, we first give a series of amplification results in terms of uniform mixing properties of the Markov process defined by said operator. Next we provide amplification bounds in terms of coupling arguments which can be applied in cases where uniform mixing is not available. Finally, we introduce a new family of mechanisms based on diffusion processes which are closed under post-processing, and analyze their privacy via a novel heat flow argument. On the applied side, we generalize the analysis of "privacy amplification by iteration" in Noisy SGD and show it admits an exponential improvement in the strongly convex case, and study a mechanism based on the Ornstein-Uhlenbeck diffusion process which contains the Gaussian mechanism with optimal post-processing on bounded inputs as a special case.

LGMay 27, 2019
Model-Agnostic Counterfactual Explanations for Consequential Decisions

Amir-Hossein Karimi, Gilles Barthe, Borja Balle et al.

Predictive models are being increasingly used to support consequential decision making at the individual level in contexts such as pretrial bail and loan approval. As a result, there is increasing social and legal pressure to provide explanations that help the affected individuals not only to understand why a prediction was output, but also how to act to obtain a desired outcome. To this end, several works have proposed optimization-based methods to generate nearest counterfactual explanations. However, these methods are often restricted to a particular subset of models (e.g., decision trees or linear models) and differentiable distance functions. In contrast, we build on standard theory and tools from formal verification and propose a novel algorithm that solves a sequence of satisfiability problems, where both the distance function (objective) and predictive model (constraints) are represented as logic formulae. As shown by our experiments on real-world data, our algorithm is: i) model-agnostic ({non-}linear, {non-}differentiable, {non-}convex); ii) data-type-agnostic (heterogeneous features); iii) distance-agnostic ($\ell_0, \ell_1, \ell_\infty$, and combinations thereof); iv) able to generate plausible and diverse counterfactuals for any sample (i.e., 100% coverage); and v) at provably optimal distances.

LGMay 24, 2019
Hypothesis Testing Interpretations and Renyi Differential Privacy

Borja Balle, Gilles Barthe, Marco Gaboardi et al.

Differential privacy is a de facto standard in data privacy, with applications in the public and private sectors. A way to explain differential privacy, which is particularly appealing to statistician and social scientists is by means of its statistical hypothesis testing interpretation. Informally, one cannot effectively test whether a specific individual has contributed her data by observing the output of a private mechanism---any test cannot have both high significance and high power. In this paper, we identify some conditions under which a privacy definition given in terms of a statistical divergence satisfies a similar interpretation. These conditions are useful to analyze the distinguishability power of divergences and we use them to study the hypothesis testing interpretation of some relaxations of differential privacy based on Renyi divergence. This analysis also results in an improved conversion rule between these definitions and differential privacy.

CRApr 9, 2019
The Last Mile: High-Assurance and High-Speed Cryptographic Implementations

José Bacelar Almeida, Manuel Barbosa, Gilles Barthe et al.

We develop a new approach for building cryptographic implementations. Our approach goes the last mile and delivers assembly code that is provably functionally correct, protected against side-channels, and as efficient as hand-written assembly. We illustrate ur approach using ChaCha20-Poly1305, one of the mandatory ciphersuites in TLS 1.3, and deliver formally verified vectorized implementations which outperform the fastest non-verified code. We realize our approach by combining the Jasmin framework, which offers in a single language features of high-level and low-level programming, and the EasyCrypt proof assistant, which offers a versatile verification infrastructure that supports proofs of functional correctness and equivalence checking. Neither of these tools had been used for functional correctness before. Taken together, these infrastructures empower programmers to develop efficient and verified implementations by "game hopping", starting from reference implementations that are proved functionally correct against a specification, and gradually introducing program optimizations that are proved correct by equivalence checking. We also make several contributions of independent interest, including a new and extensible verified compiler for Jasmin, with a richer memory model and support for vectorized instructions, and a new embedding of Jasmin in EasyCrypt.

LGJul 4, 2018
Privacy Amplification by Subsampling: Tight Analyses via Couplings and Divergences

Borja Balle, Gilles Barthe, Marco Gaboardi

Differential privacy comes equipped with multiple analytical tools for the design of private data analyses. One important tool is the so-called "privacy amplification by subsampling" principle, which ensures that a differentially private mechanism run on a random subsample of a population provides higher privacy guarantees than when run on the entire population. Several instances of this principle have been studied for different random subsampling methods, each with an ad-hoc analysis. In this paper we present a general method that recovers and improves prior analyses, yields lower bounds and derives new instances of privacy amplification by subsampling. Our method leverages a characterization of differential privacy as a divergence which emerged in the program verification community. Furthermore, it introduces new tools, including advanced joint convexity and privacy profiles, which might be of independent interest.

LOMar 27, 2018
Facets of Software Doping

Gilles Barthe, Pedro R. D'Argenio, Bernd Finkbeiner et al.

This paper provides an informal discussion of the formal aspects of software doping.

CRJan 24, 2017
Reasoning about Probabilistic Defense Mechanisms against Remote Attacks

Martín Ochoa, Sebastian Banescu, Cynthia Disenfeld et al.

Despite numerous countermeasures proposed by practitioners and researchers, remote control-flow alteration of programs with memory-safety vulnerabilities continues to be a realistic threat. Guaranteeing that complex software is completely free of memory-safety vulnerabilities is extremely expensive. Probabilistic countermeasures that depend on random secret keys are interesting, because they are an inexpensive way to raise the bar for attackers who aim to exploit memory-safety vulnerabilities. Moreover, some countermeasures even support legacy systems. However, it is unclear how to quantify and compare the effectiveness of different probabilistic countermeasures or combinations of such countermeasures. In this paper we propose a methodology to rigorously derive security bounds for probabilistic countermeasures. We argue that by representing security notions in this setting as events in probabilistic games, similarly as done with cryptographic security definitions, concrete and asymptotic guarantees can be obtained against realistic attackers. These guarantees shed light on the effectiveness of single countermeasures and their composition and allow practitioners to more precisely gauge the risk of an attack.

LOJan 19, 2016
Proving Differential Privacy via Probabilistic Couplings

Gilles Barthe, Marco Gaboardi, Benjamin Grégoire et al.

In this paper, we develop compositional methods for formally verifying differential privacy for algorithms whose analysis goes beyond the composition theorem. Our methods are based on the observation that differential privacy has deep connections with a generalization of probabilistic couplings, an established mathematical tool for reasoning about stochastic processes. Even when the composition theorem is not helpful, we can often prove privacy by a coupling argument. We demonstrate our methods on two algorithms: the Exponential mechanism and the Above Threshold algorithm, the critical component of the famous Sparse Vector algorithm. We verify these examples in a relational program logic apRHL+, which can construct approximate couplings. This logic extends the existing apRHL logic with more general rules for the Laplace mechanism and the one-sided Laplace mechanism, and new structural rules enabling pointwise reasoning about privacy; all the rules are inspired by the connection with coupling. While our paper is presented from a formal verification perspective, we believe that its main insight is of independent interest for the differential privacy community.

LOJul 10, 2014
Proving differential privacy in Hoare logic

Gilles Barthe, Marco Gaboardi, Emilio Jesús Gallego Arias et al.

Differential privacy is a rigorous, worst-case notion of privacy-preserving computation. Informally, a probabilistic program is differentially private if the participation of a single individual in the input database has a limited effect on the program's distribution on outputs. More technically, differential privacy is a quantitative 2-safety property that bounds the distance between the output distributions of a probabilistic program on adjacent inputs. Like many 2-safety properties, differential privacy lies outside the scope of traditional verification techniques. Existing approaches to enforce privacy are based on intricate, non-conventional type systems, or customized relational logics. These approaches are difficult to implement and often cumbersome to use. We present an alternative approach that verifies differential privacy by standard, non-relational reasoning on non-probabilistic programs. Our approach transforms a probabilistic program into a non-probabilistic program which simulates two executions of the original program. We prove that if the target program is correct with respect to a Hoare specification, then the original probabilistic program is differentially private. We provide a variety of examples from the differential privacy literature to demonstrate the utility of our approach. Finally, we compare our approach with existing verification techniques for privacy.