DSSep 14, 2014
A Framework for Automated Competitive Analysis of On-line Scheduling of Firm-Deadline TasksKrishnendu Chatterjee, Andreas Pavlogiannis, Alexander Kößler et al.
We present a flexible framework for the automated competitive analysis of on-line scheduling algorithms for firm-deadline real-time tasks based on multi-objective graphs: Given a taskset and an on-line scheduling algorithm specified as a labeled transition system, along with some optional safety, liveness, and/or limit-average constraints for the adversary, we automatically compute the competitive ratio of the algorithm w.r.t. a clairvoyant scheduler. We demonstrate the flexibility and power of our approach by comparing the competitive ratio of several on-line algorithms, including $D^{over}$, that have been proposed in the past, for various tasksets. Our experimental results reveal that none of these algorithms is universally optimal, in the sense that there are tasksets where other schedulers provide better performance. Our framework is hence a very useful design tool for selecting optimal algorithms for a given application.
94.5PLApr 15
On the Decidability of Verification under Release/AcquireGiovanna Kobus Conrado, Andreas Pavlogiannis
The verification of concurrent programs under weak-memory models is a burgeoning effort, owing to the increasing adoption of weak memory in concurrent software and hardware. Release/Acquire has become the standard model for high-performance concurrent programming, adopted by common mainstream languages and computer architectures. In a surprising result, Abdulla et al. (PLDI'19) proved that reachability in this model is undecidable when programs have access to atomic Read-Modify-Write (RMW) operations. Moreover, undecidability holds even for executions with just 4 contexts, and is thus immune to underapproximations based on bounded context switching. The canonical, RMW-free case was left as a challenging question, proving a non-primitive recursive lower bound as a first step, and has remained open for the past seven years. In this paper, we settle the above open question by proving that reachability is undecidable even in the RMW-free fragment of Release/Acquire, thereby characterizing the simplest set of communication primitives that gives rise to undecidability. Moreover, we prove that bounding both the number of context switches and the number of RMWs recovers decidability, thus fully characterizing the boundary of decidability along the dimensions of RMW-bounding and context-bounding.
29.0PLApr 13
Fast Atomicity MonitoringHünkar Can Tun, Yifan Dong, Andreas Pavlogiannis
Atomicity is a fundamental abstraction in concurrency, specifying that program behavior can be understood by considering specific code blocks executing atomically. However, atomicity invariants are tricky to maintain while also optimizing for code efficiency, and atomicity violations are a common root cause of many concurrency bugs. To address this problem, several dynamic techniques have been developed for testing whether a program execution adheres to an atomicity specification, most often instantiated as \emph{conflict serializability}. The efficiency of the analysis has been targeted in various papers, with the state-of-the-art algorithms \textsc{RegionTrack} and \textsc{Aerodrome} achieving a time complexity $O(nk^3)$ and $O(nk(k + v + \ell))$, respectively, for a trace $σ$ of $n$ events, $k$ threads, $v$ locations, and $\ell$ locks. In this paper we introduce \textsc{AtomSanitizer}, a new algorithm for testing conflict serializability, with time complexity $O(nk^2)$. \textsc{AtomSanitizer} operates in an efficient streaming style, is theoretically faster than all existing algorithms, and also has a smaller memory footprint. Moreover, \textsc{AtomSanitizer} is the first algorithm designed to incur minimal locking when deployed in a concurrent monitoring setting. Experiments on standard benchmarks indicate that \textsc{AtomSanitizer} is always faster in practice than all existing conflict-serializability testers. Finally, we also implement \textsc{AtomSanitizer} inside the TSAN framework, for monitoring atomicity in real time. Our experiments reveal that \textsc{AtomSanitizer} incurs minimal time and space overhead compared to the data-race detection engine of TSAN, and thus is the first algorithm for conflict serializability demonstrated to be suitable for a runtime monitoring setting.
LOJan 17, 2022
A Tree Clock Data Structure for Causal Orderings in Concurrent ExecutionsUmang Mathur, Andreas Pavlogiannis, Hünkar Can Tunç et al.
Dynamic techniques are a scalable and effective way to analyze concurrent programs. Instead of analyzing all behaviors of a program, these techniques detect errors by focusing on a single program execution. Often a crucial step in these techniques is to define a causal ordering between events in the execution, which is then computed using vector clocks, a simple data structure that stores logical times of threads. The two basic operations of vector clocks, namely join and copy, require $Θ(k)$ time, where $k$ is the number of threads. Thus they are a computational bottleneck when $k$ is large. In this work, we introduce tree clocks, a new data structure that replaces vector clocks for computing causal orderings in program executions. Joining and copying tree clocks takes time that is roughly proportional to the number of entries being modified, and hence the two operations do not suffer the a-priori $Θ(k)$ cost per application. We show that when used to compute the classic happens-before (HB) partial order, tree clocks are optimal, in the sense that no other data structure can lead to smaller asymptotic running time. Moreover, we demonstrate that tree clocks can be used to compute other partial orders, such as schedulable-happens-before (SHB) and the standard Mazurkiewicz (MAZ) partial order, and thus are a versatile data structure. Our experiments show that just by replacing vector clocks with tree clocks, the computation becomes from $2.02 \times$ faster (MAZ) to $2.66 \times$ (SHB) and $2.97 \times$ (HB) on average per benchmark. These results illustrate that tree clocks have the potential to become a standard data structure with wide applications in concurrent analyses.
DSJan 29, 2020
Optimal and Perfectly Parallel Algorithms for On-demand Data-flow AnalysisKrishnendu Chatterjee, Amir Kafshdar Goharshady, Rasmus Ibsen-Jensen et al.
Interprocedural data-flow analyses form an expressive and useful paradigm of numerous static analysis applications, such as live variables analysis, alias analysis and null pointers analysis. The most widely-used framework for interprocedural data-flow analysis is IFDS, which encompasses distributive data-flow functions over a finite domain. On-demand data-flow analyses restrict the focus of the analysis on specific program locations and data facts. This setting provides a natural split between (i) an offline (or preprocessing) phase, where the program is partially analyzed and analysis summaries are created, and (ii) an online (or query) phase, where analysis queries arrive on demand and the summaries are used to speed up answering queries. In this work, we consider on-demand IFDS analyses where the queries concern program locations of the same procedure (aka same-context queries). We exploit the fact that flow graphs of programs have low treewidth to develop faster algorithms that are space and time optimal for many common data-flow analyses, in both the preprocessing and the query phase. We also use treewidth to develop query solutions that are embarrassingly parallelizable, i.e. the total work for answering each query is split to a number of threads such that each thread performs only a constant amount of work. Finally, we implement a static analyzer based on our algorithms, and perform a series of on-demand analysis experiments on standard benchmarks. Our experimental results show a drastic speed-up of the queries after only a lightweight preprocessing phase, which significantly outperforms existing techniques.