SE | Sep 25, 2023
Guess & Sketch: Language Model Guided Transpilation
Celine Lee, Abdulrahman Mahmoud, Michal Kurek et al.
Maintaining legacy software requires many software and systems engineering hours. Assembly code programs, which demand low-level control over the computer machine state and have no variable names, are particularly difficult for humans to analyze. Existing conventional program translators guarantee correctness, but are hand-engineered for the source and target programming languages in question. Learned transpilation, i.e. automatic translation of code, offers an alternative to manual re-writing and engineering efforts. Automated symbolic program translation approaches guarantee correctness but struggle to scale to longer programs due to the exponentially large search space. Their rigid rule-based systems also limit their expressivity, so they can only reason about a reduced space of programs. Probabilistic neural language models (LMs) produce plausible outputs for every input, but do so at the cost of guaranteed correctness. In this work, we leverage the strengths of LMs and symbolic solvers in a neurosymbolic approach to learned transpilation for assembly code. Assembly code is an appropriate setting for a neurosymbolic approach, since assembly code can be divided into shorter non-branching basic blocks amenable to the use of symbolic methods. Guess & Sketch extracts alignment and confidence information from features of the LM then passes it to a symbolic solver to resolve semantic equivalence of the transpilation input and output. We test Guess & Sketch on three different test sets of assembly transpilation tasks, varying in difficulty, and show that it successfully transpiles 57.6% more examples than GPT-4 and 39.6% more examples than an engineered transpiler. We also share a training and evaluation dataset for this task.
CR | Jun 3, 2021
Relational Analysis of Sensor Attacks on Cyber-Physical Systems
Jian Xiang, Nathan Fulton, Stephen Chong
Cyber-physical systems, such as self-driving cars or autonomous aircraft, must defend against attacks that target sensor hardware. Analyzing system design can help engineers understand how a compromised sensor could impact the system's behavior; however, designing security analyses for cyber-physical systems is difficult due to their combination of discrete dynamics, continuous dynamics, and nondeterminism. This paper contributes a framework for modeling and analyzing sensor attacks on cyber-physical systems, using the formalism of hybrid programs. We formalize and analyze two relational properties of a system's robustness. These relational properties respectively express (1) whether a system's safety property can be influenced by sensor attacks, and (2) whether a system's high-integrity state can be affected by sensor attacks. We characterize these relational properties by defining an equivalence relation between a system under attack and the original unattacked system. That is, the system satisfies the robustness properties if executions of the attacked system are appropriately related to executions of the unattacked system. We present two techniques for reasoning about the equivalence relation and thus proving the relational properties for a system. One proof technique decomposes large proof obligations to smaller proof obligations. The other proof technique adapts the self-composition technique from the literature on secure information-flow, allowing us to reduce reasoning about the equivalence of two systems to reasoning about properties of a single system. This technique allows us to reuse existing tools for reasoning about properties of hybrid programs, but is challenging due to the combination of discrete dynamics, continuous dynamics, and nondeterminism. To evaluate, we present three case studies motivated by real design flaws in existing cyber-physical systems.
CR | Apr 21, 2021
A Calculus for Flow-Limited Authorization
Owen Arden, Anitha Gollamudi, Ethan Cecchetti et al.
Real-world applications routinely make authorization decisions based on dynamic computation. Reasoning about dynamically computed authority is challenging. Integrity of the system might be compromised if attackers can improperly influence the authorizing computation. Confidentiality can also be compromised by authorization, since authorization decisions are often based on sensitive data such as membership lists and passwords. Previous formal models for authorization do not fully address the security implications of permitting trust relationships to change, which limits their ability to reason about authority that derives from dynamic computation. Our goal is an approach to constructing dynamic authorization mechanisms that do not violate confidentiality or integrity. The Flow-Limited Authorization Calculus (FLAC) is a simple, expressive model for reasoning about dynamic authorization as well as an information flow control language for securely implementing various authorization mechanisms. FLAC combines the insights of two previous models: it extends the Dependency Core Calculus with features made possible by the Flow-Limited Authorization Model. FLAC provides strong end-to-end information security guarantees even for programs that incorporate and implement rich dynamic authorization mechanisms. These guarantees include noninterference and robust declassification, which prevent attackers from influencing information disclosures in unauthorized ways. We prove these security properties formally for all FLAC programs and explore the expressiveness of FLAC with several examples.
CR | Oct 22, 2019
Formalizing Privacy Laws for License Generation and Data Repository Decision Automation
Micah Altman, Stephen Chong, Alexandra Wood
In this paper, we summarize work-in-progress on expert system support to automate some data deposit and release decisions within a data repository, and to generate custom license agreements for those data transfers. Our approach formalizes via a logic programming language the privacy-relevant aspects of laws, regulations, and best practices, supported by legal analysis documented in legal memoranda. This formalization enables automated reasoning about the conditions under which a repository can transfer data, through interrogation of users, and the application of formal rules to the facts obtained from users. The proposed system takes the specific conditions for a given data release and produces a custom data use agreement that accurately captures the relevant restrictions on data use. This enables appropriate decisions and accurate licenses, while removing the bottleneck of lawyer effort per data transfer. The operation of the system aims to be transparent, in the sense that administrators, lawyers, institutional review boards, and other interested parties can evaluate the legal reasoning and interpretation embodied in the formalization, and the specific rationale for a decision to accept or release a particular dataset.
CR | Aug 29, 2017
Cryptographically Secure Information Flow Control on Key-Value Stores
Lucas Waye, Pablo Buiras, Owen Arden et al.
We present Clio, an information flow control (IFC) system that transparently incorporates cryptography to enforce confidentiality and integrity policies on untrusted storage. Clio insulates developers from explicitly manipulating keys and cryptographic primitives by leveraging the policy language of the IFC system to automatically use the appropriate keys and correct cryptographic operations. We prove that Clio is secure with a novel proof technique that is based on a proof style from cryptography together with standard programming languages results. We present a prototype Clio implementation and a case study that demonstrates Clio's practicality.
CR | Aug 2, 2016
Report on the NSF Workshop on Formal Methods for Security
Stephen Chong, Joshua Guttman, Anupam Datta et al.
Report on the NSF Workshop on Formal Methods for Security, held 19-20 November 2015.
CR | Sep 1, 2014
Using Architecture to Reason about Information Security
Stephen Chong, Ron van der Meyden
We demonstrate, by a number of examples, that information-flow security properties can be proved from abstract architectural descriptions that describe only the causal structure of a system and local properties of trusted components. We specify these architectural descriptions of systems by generalizing intransitive noninterference policies to admit the ability to filter information passed between communicating domains. A notion of refinement of such system architectures is developed that supports top-down development of architectural specifications and proofs by abstraction of information security properties. We also show that, in a concrete setting where the causal structure is enforced by access control, a static check of the access control setting plus local verification of the trusted components is sufficient to prove that a generalized intransitive noninterference policy is satisfied.