3.7SEMay 4
A Shallow Embedding of Datalog in LeanRamy Shahin
Datalog is a lightweight logic programming language, based on the logic of Horn clauses. Lean, on the other hand, is a proof assistant system and language based on the Calculus of Inductive Constructions (CIC). Datalog is more constrained and less expressive than Lean but has a long history of established deduction algorithms. Writing definitions and queries in the Datalog fragment of Lean would be more succinct and understandable than writing them in Lean itself. This paper outlines the design and implementation of a shallow embedding of Datalog as a Domain Specific Language (DSL) on top of Lean. Bidirectional interoperability between the Datalog DSL and Lean is a primary goal of this design. In addition to rules and facts, backward chaining queries are automatically translated into theorems with tactic-based proofs. The paper also includes three simple examples of how the DSL can be used.
SEJul 16, 2021
Applying Declarative Analysis to Software Product Line Models: An Industrial StudyRamy Shahin, Robert Hackman, Rafael Toledo et al.
Software Product Lines (SPLs) are families of related software products developed from a common set of artifacts. Most existing analysis tools can be applied to a single product at a time, but not to an entire SPL. Some tools have been redesigned/re-implemented to support the kind of variability exhibited in SPLs, but this usually takes a lot of effort, and is error-prone. Declarative analyses written in languages like Datalog have been collectively lifted to SPLs in prior work, which makes the process of applying an existing declarative analysis to a product line more straightforward. In this paper, we take an existing declarative analysis (behaviour alteration) written in the Grok declarative language, port it to Datalog, and apply it to a set of automotive software product lines from General Motors. We discuss the design of the analysis pipeline used in this process, present its scalability results, and provide a means to visualize the analysis results for a subset of products filtered by feature expression. We also reflect on some of the lessons learned throughout this project.
SEJun 17, 2021
Towards Assurance-Driven Architectural Decomposition of Software SystemsRamy Shahin
Computer systems are so complex, so they are usually designed and analyzed in terms of layers of abstraction. Complexity is still a challenge facing logical reasoning tools that are used to find software design flaws and implementation bugs. Abstraction is also a common technique for scaling those tools to more complex systems. However, the abstractions used in the design phase of systems are in many cases different from those used for assurance. In this paper we argue that different software quality assurance techniques operate on different aspects of software systems. To facilitate assurance, and for a smooth integration of assurance tools into the Software Development Lifecycle (SDLC), we present a 4-dimensional meta-architecture that separates computational, coordination, and stateful software artifacts early on in the design stage. We enumerate some of the design and assurance challenges that can be addressed by this meta-architecture, and demonstrate it on the high-level design of a simple file system.
SEApr 30, 2021
Towards Certified Analysis of Software Product Line Safety CasesRamy Shahin, Sahar Kokaly, Marsha Chechik
Safety-critical software systems are in many cases designed and implemented as families of products, usually referred to as Software Product Lines (SPLs). Products within an SPL vary from each other in terms of which features they include. Applying existing analysis techniques to SPLs and their safety cases is usually challenging because of the potentially exponential number of products with respect to the number of supported features. In this paper, we present a methodology and infrastructure for certified \emph{lifting} of existing single-product safety analyses to product lines. To ensure certified safety of our infrastructure, we implement it in an interactive theorem prover, including formal definitions, lemmas, correctness criteria theorems, and proofs. We apply this infrastructure to formalize and lift a Change Impact Assessment (CIA) algorithm. We present a formal definition of the lifted algorithm, outline its correctness proof (with the full machine-checked proof available online), and discuss its implementation within a model management framework.
SEFeb 5, 2021
Towards Modal Software EngineeringRamy Shahin
In this paper we introduce the notion of Modal Software Engineering: automatically turning sequential, deterministic programs into semantically equivalent programs efficiently operating on inputs coming from multiple overlapping worlds. We are drawing an analogy between modal logics, and software application domains where multiple sets of inputs (multiple worlds) need to be processed efficiently. Typically those sets highly overlap, so processing them independently would involve a lot of redundancy, resulting in lower performance, and in many cases intractability. Three application domains are presented: reasoning about feature-based variability of Software Product Lines (SPLs), probabilistic programming, and approximate programming.
PLOct 1, 2020
Automatic and Efficient Variability-Aware Lifting of Functional ProgramsRamy Shahin, Marsha Chechik
A software analysis is a computer program that takes some representation of a software product as input and produces some useful information about that product as output. A software product line encompasses \emph{many} software product variants, and thus existing analyses can be applied to each of the product variations individually, but not to the entire product line as a whole. Enumerating all product variants and analyzing them one by one is usually intractable due to the combinatorial explosion of the number of product variants with respect to product line features. Several software analyses (e.g., type checkers, model checkers, data flow analyses) have been redesigned/re-implemented to support variability. This usually requires a lot of time and effort, and the variability-aware version of the analysis might have new errors/bugs that do not exist in the original one. Given an analysis program written in a functional language based on PCF, in this paper we present two approaches to transforming (lifting) it into a semantically equivalent variability-aware analysis. A light-weight approach (referred to as \emph{shallow lifting}) wraps the analysis program into a variability-aware version, exploring all combinations of its input arguments. Deep lifting, on the other hand, is a program rewriting mechanism where the syntactic constructs of the input program are rewritten into their variability-aware counterparts. Compositionally this results in an efficient program semantically equivalent to the input program, modulo variability. We present the correctness criteria for functional program lifting, together with correctness proof sketches of our program transformations. We evaluate our approach on a set of program analyses applied to the BusyBox C-language product line.
PLDec 9, 2019
Variability-aware DatalogRamy Shahin, Marsha Chechik
Variability-aware computing is the efficient application of programs to different sets of inputs that exhibit some variability. One example is program analyses applied to Software Product Lines (SPLs). In this paper we present the design and development of a variability-aware version of the Soufflé Datalog engine. The engine can take facts annotated with Presence Conditions (PCs) as input, and compute the PCs of its inferred facts, eliminating facts that do not exist in any valid configuration. We evaluate our variability-aware Soufflé implementation on several fact sets annotated with PCs to measure the associated overhead in terms of processing time and database size.
SEJul 4, 2019
Lifting Datalog-Based Analyses to Software Product LinesRamy Shahin, Marsha Chechik, Rick Salay
Applying program analyses to Software Product Lines (SPLs) has been a fundamental research problem at the intersection of Product Line Engineering and software analysis. Different attempts have been made to "lift" particular product-level analyses to run on the entire product line. In this paper, we tackle the class of Datalog-based analyses (e.g., pointer and taint analyses), study the theoretical aspects of lifting Datalog inference, and implement a lifted inference algorithm inside the Soufflé Datalog engine. We evaluate our implementation on a set of benchmark product lines. We show significant savings in processing time and fact database size (billions of times faster on one of the benchmarks) compared to brute-force analysis of each product individually.