SESep 2, 2024
Reward Augmentation in Reinforcement Learning for Testing Distributed SystemsAndrea Borgarelli, Constantin Enea, Rupak Majumdar et al.
Bugs in popular distributed protocol implementations have been the source of many downtimes in popular internet services. We describe a randomized testing approach for distributed protocol implementations based on reinforcement learning. Since the natural reward structure is very sparse, the key to successful exploration in reinforcement learning is reward augmentation. We show two different techniques that build on one another. First, we provide a decaying exploration bonus based on the discovery of new states -- the reward decays as the same state is visited multiple times. The exploration bonus captures the intuition from coverage-guided fuzzing of prioritizing new coverage points; in contrast to other schemes, we show that taking the maximum of the bonus and the Q-value leads to more effective exploration. Second, we provide waypoints to the algorithm as a sequence of predicates that capture interesting semantic scenarios. Waypoints exploit designer insight about the protocol and guide the exploration to ``interesting'' parts of the state space. Our reward structure ensures that new episodes can reliably get to deep interesting states even without execution caching. We have implemented our algorithm in Go. Our evaluation on three large benchmarks (RedisRaft, Etcd, and RSL) shows that our algorithm can significantly outperform baseline approaches in terms of coverage and bug finding.
56.9PLMay 13
On the Complexity of Checking Soundness of Natural Reductions (Extended Version)Constantin Enea, Azadeh Farzan, Dominik Klumpp
The verification of reductions, representative subsets of interleavings, simplifies correctness proofs of parameterized concurrent programs. We introduce an expressive class of syntactic reductions, which we call natural reductions. Natural reductions are specified by introducing atomic blocks and global rendezvous points in the parameterized program's thread template. We study the problem of deciding whether a given natural reduction is sound wrt. a given (semi-)commutativity relation. In the case that there is no synchronization between threads, we present a sound and complete polynomial-time algorithm. In the case where synchronization is considered, we provide a general lower bound for the problem (parametric in the synchronization mechanism), and show that the problem is coNP-hard already for a simple mechanism like locking.
SEJun 28, 2017Code
Exposing Non-Atomic Methods of Concurrent ObjectsMichael Emmi, Constantin Enea
Multithreaded software is typically built with specialized concurrent objects like atomic integers, queues, and maps. These objects' methods are designed to behave according to certain consistency criteria like atomicity, despite being optimized to avoid blocking and exploit parallelism, e.g., by using atomic machine instructions like compare and exchange (cmpxchg). Exposing atomicity violations is important since they generally lead to elusive bugs that are difficult to identify, reproduce, and ultimately repair. In this work we expose atomicity violations in concurrent object implementations from the most widely-used software development kit: The Java Development Kit (JDK). We witness atomicity violations via simple test harnesses containing few concurrent method invocations. While stress testing is effective at exposing violations given catalytic test harnesses and lightweight means of falsifying atomicity, divining effectual catalysts can be difficult, and atomicity checks are generally cumbersome. We overcome these problems by automating test-harness search, and establishing atomicity via membership in precomputed sets of acceptable return-value outcomes. Our approach enables testing millions of executions of each harness each second (per processor core). This scale is important since atomicity violations are observed in very few executions (tens to hundreds out of millions) of very few harnesses (one out of hundreds to thousands). Our implementation is open source and publicly available.
SEJul 10, 2018
Datalog-based Scalable Semantic Diffing of Concurrent ProgramsChungha Sung, Shuvendu Lahiri, Constantin Enea et al.
When an evolving program is modified to address issues related to thread synchronization, there is a need to confirm the change is correct, i.e., it does not introduce unexpected behavior. However, manually comparing two programs to identify the semantic difference is labor intensive and error prone, whereas techniques based on model checking are computationally expensive. To fill the gap, we develop a fast and approximate static analysis for computing synchronization differences of two programs. The method is fast because, instead of relying on heavy-weight model checking techniques, it leverages a polynomial-time Datalog-based program analysis framework to compute differentiating data-flow edges, i.e., edges allowed by one program but not the other. Although approximation is used our method is sufficiently accurate due to careful design of the Datalog inference rules and iterative increase of the required data-flow edges for representing a difference. We have implemented our method and evaluated it on a large number of multithreaded C programs to confirm its ability to produce, often within seconds, the same differences obtained by human; in contrast, prior techniques based on model checking take minutes or even hours and thus can be 10x to 1000x slower.
LOJul 20, 2015
On Automated Lemma Generation for Separation Logic with Inductive DefinitionsConstantin Enea, Mihaela Sighireanu, Zhilin Wu
Separation Logic with inductive definitions is a well-known approach for deductive verification of programs that manipulate dynamic data structures. Deciding verification conditions in this context is usually based on user-provided lemmas relating the inductive definitions. We propose a novel approach for generating these lemmas automatically which is based on simple syntactic criteria and deterministic strategies for applying them. Our approach focuses on iterative programs, although it can be applied to recursive programs as well, and specifications that describe not only the shape of the data structures, but also their content or their size. Empirically, we find that our approach is powerful enough to deal with sophisticated benchmarks, e.g., iterative procedures for searching, inserting, or deleting elements in sorted lists, binary search tress, red-black trees, and AVL trees, in a very efficient way.