SEMar 27
Etna: An Evaluation Platform for Property-Based TestingAlperen Keles, Jessica Shi, Nikhil Kamath et al.
Property-based testing is a mainstay of functional programming, boasting a rich literature, an enthusiastic user community, and an abundance of tools~ -- so many, indeed, that new users may have difficulty choosing. Moreover, any given framework may support a variety of strategies for generating test inputs; even experienced users may wonder which are better in any given situation. Sadly, the PBT literature, though long on creativity, is short on rigorous comparisons to help answer such questions. We present ETNA, a platform for empirical evaluation and comparison of PBT techniques. ETNA incorporates a number of popular PBT frameworks and testing workloads from the literature, and its extensible architecture makes adding new ones easy, while handling the technical drudgery of performance measurement. To illustrate its benefits, we use ETNA to carry out several experiments with popular PBT approaches in Rocq, Haskell, OCaml, Racket, and Rust, allowing users to more clearly understand best practices and tradeoffs.
PLApr 14
The Search for Constrained Random GeneratorsHarrison Goldstein, Hila Peleg, Cassia Torczon et al.
Among the biggest challenges in property-based testing (PBT) is the constrained random generation problem: given a predicate on program values, randomly sample from the set of all values satisfying that predicate, and only those values. Efficient solutions to this problem are critical, since the executable specifications used by PBT often have preconditions that input values must satisfy in order to be valid test cases, and satisfying values are often sparsely distributed. We propose a novel approach to this problem using ideas from deductive program synthesis. We present a set of synthesis rules, based on a denotational semantics of generators, that give rise to an automatic procedure for synthesizing correct generators. Our system handles recursive predicates by rewriting them as catamorphisms and then matching with appropriate anamorphisms; this is theoretically simpler than other approaches to synthesis for recursive functions, yet still extremely expressive. Our implementation, Palamedes, is an extensible library for the Lean theorem prover. The synthesis algorithm itself is built on standard proof-search tactics, reducing implementation burden and allowing the algorithm to benefit from further advances in Lean proof automation.
HCApr 6
Decision-Oriented Programming with AporiaSaketh Ram Kasibatla, Raven Rothkopf, Hila Peleg et al.
AI agents allow developers to express computational intent abstractly, reducing cognitive effort and helping achieve flow during programming. Increased abstraction, however, comes at a cost: developers cede decision-making authority to agents, often without realizing that important design decisions are being made without them. We aim to bring these decisions to the foreground in a paradigm we dub decision-oriented programming. In DOP, (1) decisions are explicit and structured, serving as the shared medium between the programmer and the agent; (2) decisions are co-authored interactively, with the agent proactively eliciting them from the programmer; and (3) each decision is traceable to code. As a step towards this vision, we have built Aporia, a design probe that tracks decisions in a persistent, editable Decision Bank; elicits them by asking programmers design questions; and encodes each decision as an executable test suite that can be used to validate the implementation. In a user study of 14 programmers, Aporia increased engagement in the design process and scaffolded both exploration and validation. Participants also gained a more accurate understanding of their implementations, with their mental models 5x less likely to disagree with the code than a baseline coding agent.
PLApr 7
Fail Faster: Staging and Fast Randomness for High-Performance PBTCynthia Richey, Joseph W. Cutler, Harrison Goldstein et al.
Property-based testing (PBT) relies on generators for random test cases, often constructed using embedded domain specific languages, which provide expressive combinators for building and composing generators. The effectiveness of PBT depends critically on the speed of these generators. However, careful measurements show that the generator performance of widely used PBT libraries falls well short of what is possible, due principally to (1) the abstraction overhead of their combinator-heavy style and (2) suboptimal sources of randomness. We characterize, quantify, and address these bottlenecks. To eliminate abstraction overheads, we propose a technique based on multi-stage programming, dubbed Allegro. We apply this technique to leading generator libraries in OCaml and Scala 3, significantly improving performance. To quantify the performance impact of the randomness source, we carry out a controlled experiment, replacing the randomness in the OCaml PBT library with an optimized version. Both interventions exactly preserve the semantics of generators, enabling precise, pointwise comparisons. Together, these improvements find bugs up to $13\times$ faster.
PLMay 2, 2021
Formalizing Stack Safety as a Security PropertySean Noble Anderson, Roberto Blanco, Leonidas Lampropoulos et al.
The term stack safety is used to describe a variety of compiler, run-time, and hardware mechanisms for protecting stack memory. Unlike "the heap," the ISA-level stack does not correspond to a single high-level language concept: different compilers use it in different ways to support procedural and functional abstraction mechanisms from a wide range of languages. This protean nature makes it difficult to nail down what it means to correctly enforce stack safety. We propose a new formal characterization of stack safety using concepts from language-based security. Rather than treating stack safety as a monolithic property, we decompose it into an integrity property and a confidentiality property for each of the caller and the callee, plus a control-flow property: five properties in all. This formulation is motivated by a particular class of enforcement mechanisms, the "lazy" stack safety micro-policies studied by Roessler and DeHon, which permit functions to write into one another's frames but taint the changed locations so that the frame's owner cannot access them. No existing characterization of stack safety captures this style of safety; we capture it here by stating our properties in terms of the observable behavior of the system. Our properties go further than previous formal definitions of stack safety, supporting caller- and callee-saved registers, arguments passed on the stack, and tail-call elimination. We validate the properties by using them to distinguish between correct and incorrect implementations of Roessler and DeHon's micro-policies using property-based random testing. Our test harness successfully identifies several broken variants, including Roessler and DeHon's lazy policy; a repaired version of their policy passes our tests.
SEJan 31, 2021
Model-Based Testing of Networked ApplicationsYishuai Li, Benjamin C. Pierce, Steve Zdancewic
We present a principled automatic testing framework for application-layer protocols. The key innovation is a domain-specific embedded language for writing nondeterministic models of the behavior of networked servers. These models are defined within the Coq interactive theorem prover, supporting a smooth transition from testing to formal verification. Given a server model, we show how to automatically derive a tester that probes the server for unexpected behaviors. We address the uncertainties caused by both the server's internal choices and the network delaying messages nondeterministically. The derived tester accepts server implementations whose possible behaviors are a subset of those allowed by the nondeterministic model. We demonstrate the effectiveness of this framework by using it to specify and test a fragment of the HTTP/1.1 protocol, showing that the automatically derived tester can capture RFC violations in buggy server implementations, including the latest versions of Apache and Nginx.
CROct 8, 2020
Testing Differential Privacy with Dual InterpretersHengchu Zhang, Edo Roth, Andreas Haeberlen et al.
Applying differential privacy at scale requires convenient ways to check that programs computing with sensitive data appropriately preserve privacy. We propose here a fully automated framework for {\em testing} differential privacy, adapting a well-known "pointwise" technique from informal proofs of differential privacy. Our framework, called DPCheck, requires no programmer annotations, handles all previously verified or tested algorithms, and is the first fully automated framework to distinguish correct and buggy implementations of PrivTree, a probabilistically terminating algorithm that has not previously been mechanically checked. We analyze the probability of DPCheck mistakenly accepting a non-private program and prove that, theoretically, the probability of false acceptance can be made exponentially small by suitable choice of test size. We demonstrate DPCheck's utility empirically by implementing all benchmark algorithms from prior work on mechanical verification of differential privacy, plus several others and their incorrect variants, and show DPCheck accepts the correct implementations and rejects the incorrect variants. We also demonstrate how DPCheck can be deployed in a practical workflow to test differentially privacy for the 2020 US Census Disclosure Avoidance System (DAS).
CRFeb 2, 2018
When Good Components Go Bad: Formally Secure Compilation Despite Dynamic CompromiseCarmine Abate, Arthur Azevedo de Amorim, Roberto Blanco et al.
We propose a new formal criterion for evaluating secure compilation schemes for unsafe languages, expressing end-to-end security guarantees for software components that may become compromised after encountering undefined behavior---for example, by accessing an array out of bounds. Our criterion is the first to model dynamic compromise in a system of mutually distrustful components with clearly specified privileges. It articulates how each component should be protected from all the others---in particular, from components that have encountered undefined behavior and become compromised. Each component receives secure compilation guarantees---in particular, its internal invariants are protected from compromised components---up to the point when this component itself becomes compromised, after which we assume an attacker can take complete control and use this component's privileges to attack other components. More precisely, a secure compilation chain must ensure that a dynamically compromised component cannot break the safety properties of the system at the target level any more than an arbitrary attacker-controlled component (with the same interface and privileges, but without undefined behaviors) already could at the source level. To illustrate the model, we construct a secure compilation chain for a small unsafe language with buffers, procedures, and components, targeting a simple abstract machine with built-in compartmentalization. We give a machine-checked proof in Coq that this compiler satisfies our secure compilation criterion. Finally, we show that the protection guarantees offered by the compartmentalized abstract machine can be achieved at the machine-code level using either software fault isolation or a tag-based reference monitor.
CROct 19, 2017
Formally Secure Compilation of Unsafe Low-Level Components (Extended Abstract)Guglielmo Fachini, Catalin Hritcu, Marco Stronati et al.
We propose a new formal criterion for secure compilation, providing strong security guarantees for components written in unsafe, low-level languages with C-style undefined behavior. Our criterion goes beyond recent proposals, which protect the trace properties of a single component against an adversarial context, to model dynamic compromise in a system of mutually distrustful components. Each component is protected from all the others until it receives an input that triggers an undefined behavior, causing it to become compromised and attack the remaining uncompromised components. To illustrate this model, we demonstrate a secure compilation chain for an unsafe language with buffers, procedures, and components, compiled to a simple RISC abstract machine with built-in compartmentalization. The protection guarantees offered by this abstract machine can be achieved at the machine-code level using either software fault isolation or tag-based reference monitoring. We are working on machine-checked proofs showing that this compiler satisfies our secure compilation criterion.
PLMay 20, 2017
The Meaning of Memory SafetyArthur Azevedo de Amorim, Catalin Hritcu, Benjamin C. Pierce
We give a rigorous characterization of what it means for a programming language to be memory safe, capturing the intuition that memory safety supports local reasoning about state. We formalize this principle in two ways. First, we show how a small memory-safe language validates a noninterference property: a program can neither affect nor be affected by unreachable parts of the state. Second, we extend separation logic, a proof system for heap-manipulating programs, with a memory-safe variant of its frame rule. The new rule is stronger because it applies even when parts of the program are buggy or malicious, but also weaker because it demands a stricter form of separation between parts of the program state. We also consider a number of pragmatically motivated variations on memory safety and the reasoning principles they support. As an application of our characterization, we evaluate the security of a previously proposed dynamic monitor for memory safety of heap-allocated data.
CRFeb 14, 2016
Beyond Good and Evil: Formalizing the Security Guarantees of Compartmentalizing CompilationYannis Juglaret, Catalin Hritcu, Arthur Azevedo de Amorim et al.
Compartmentalization is good security-engineering practice. By breaking a large software system into mutually distrustful components that run with minimal privileges, restricting their interactions to conform to well-defined interfaces, we can limit the damage caused by low-level attacks such as control-flow hijacking. When used to defend against such attacks, compartmentalization is often implemented cooperatively by a compiler and a low-level compartmentalization mechanism. However, the formal guarantees provided by such compartmentalizing compilation have seen surprisingly little investigation. We propose a new security property, secure compartmentalizing compilation (SCC), that formally characterizes the guarantees provided by compartmentalizing compilation and clarifies its attacker model. We reconstruct our property by starting from the well-established notion of fully abstract compilation, then identifying and lifting three important limitations that make standard full abstraction unsuitable for compartmentalization. The connection to full abstraction allows us to prove SCC by adapting established proof techniques; we illustrate this with a compiler from a simple unsafe imperative language with procedures to a compartmentalized abstract machine.
PLOct 2, 2015
Towards a Fully Abstract Compiler Using Micro-Policies: Secure Compilation for Mutually Distrustful ComponentsYannis Juglaret, Catalin Hritcu, Arthur Azevedo de Amorim et al.
Secure compilation prevents all low-level attacks on compiled code and allows for sound reasoning about security in the source language. In this work we propose a new attacker model for secure compilation that extends the well-known notion of full abstraction to ensure protection for mutually distrustful components. We devise a compiler chain (compiler, linker, and loader) and a novel security monitor that together defend against this strong attacker model. The monitor is implemented using a recently proposed, generic tag-based protection framework called micro-policies, which comes with hardware support for efficient caching and with a formal verification methodology. Our monitor protects the abstractions of a simple object-oriented language---class isolation, the method call discipline, and type safety---against arbitrary low-level attackers.