Christopher M. Poskitt

SE
h-index20
25papers
756citations
Novelty45%
AI Score51

25 Papers

CROct 30, 2025
Security Modelling for Cyber-Physical Systems: A Systematic Literature Review

Shaofei Huang, Christopher M. Poskitt, Lwin Khin Shar

Cyber-physical systems are at the intersection of digital technology and engineering domains, rendering them high-value targets of sophisticated and well-funded cybersecurity threat actors. Prominent cybersecurity attacks on CPS have brought attention to the vulnerability of these systems and the inherent weaknesses of critical infrastructure reliant on them. Security modelling for CPS is an important mechanism to systematically identify and assess vulnerabilities, threats, and risks throughout system life cycles, and to ultimately ensure system resilience, safety, and reliability. This survey delves into state-of-the-art research on CPS security modelling, encompassing both threat and attack modelling. While these terms are sometimes used interchangeably, they are different concepts. This paper elaborates on the differences between threat and attack modelling, examining their implications for CPS security. We conducted a systematic search that yielded 449 papers, from which 32 were selected and categorised into three clusters: those focused on threat modelling methods, attack modelling methods, and literature reviews. Specifically, we sought to examine what security modelling methods exist today, and how they address real-world cybersecurity threats and CPS-specific attacker capabilities throughout the life cycle of CPS, which typically span longer durations compared to traditional IT systems. This paper also highlights several limitations in existing research, wherein security models adopt simplistic approaches that do not adequately consider the dynamic, multi-layer, multi-path, and multi-agent characteristics of real-world cyber-physical attacks.

CVAug 8, 2023
How Generalizable are Deepfake Image Detectors? An Empirical Study

Boquan Li, Jun Sun, Christopher M. Poskitt et al.

Deepfakes are becoming increasingly credible, posing a significant threat given their potential to facilitate fraud or bypass access control systems. This has motivated the development of deepfake detection methods, in which deep learning models are trained to distinguish between real and synthesized footage. Unfortunately, existing detectors struggle to generalize to deepfakes from datasets they were not trained on, but little work has been done to examine why or how this limitation can be addressed. Especially, those single-modality deepfake images reveal little available forgery evidence, posing greater challenges than detecting deepfake videos. In this work, we present the first empirical study on the generalizability of deepfake detectors, an essential goal for detectors to stay one step ahead of attackers. Our study utilizes six deepfake datasets, five deepfake image detection methods, and two model augmentation approaches, confirming that detectors do not generalize in zero-shot settings. Additionally, we find that detectors are learning unwanted properties specific to synthesis methods and struggling to extract discriminative features, limiting their ability to generalize. Finally, we find that there are neurons universally contributing to detection across seen and unseen datasets, suggesting a possible path towards zero-shot generalizability.

SEJan 26
Rethinking Artifact Evaluation for Software Engineering in the Age of Generative AI

Christoph Treude, Christopher M. Poskitt, Rashina Hoda

Peer review in software engineering research operates under tight time constraints, while generative AI has substantially reduced the human effort required to produce polished research narratives. Reviewer attention is often spent on aspects of submissions such as writing quality or literature positioning that have become relatively less effort-intensive to address, rather than on evaluating the scientific substance of a paper. At the same time, assessing whether methods are implemented correctly, analyses are sound, and claims are supported by evidence remains effort-intensive and dependent on human expertise. In software engineering research, this substance is frequently embodied in artifacts, including code, data, evidence and analysis samples, and experimental infrastructure. In this position paper, we argue that artifact evaluation should be treated as a first-class component of peer review. We frame peer review as an attention allocation problem, examine how generative AI weakens narrative quality as a signal of rigor, and argue that artifact evaluation should play a more prominent role in peer review decisions.

90.2CRMar 28
SafeClaw-R: Towards Safe and Secure Multi-Agent Personal Assistants

Haoyu Wang, Zibo Xiao, Yedi Zhang et al.

LLM-based multi-agent systems (MASs) are transforming personal productivity by autonomously executing complex, cross-platform tasks. Frameworks such as OpenClaw demonstrate the potential of locally deployed agents integrated with personal data and services, but this autonomy introduces significant safety and security risks. Unintended actions from LLM reasoning failures can cause irreversible harm, while prompt injection attacks may exfiltrate credentials or compromise the system. Our analysis shows that 36.4% of OpenClaw's built-in skills pose high or critical risks. Existing approaches, including static guardrails and LLM-as-a-Judge, lack reliable real-time enforcement and consistent authority in MAS settings. To address this, we propose SafeClaw-R, a framework that enforces safety as a system-level invariant over the execution graph by ensuring that actions are mediated prior to execution, and systematically augments skills with safe counterparts. We evaluate SafeClaw-R across three representative domains: productivity platforms, third-party skill ecosystems, and code execution environments. SafeClaw-R achieves 95.2% accuracy in Google Workspace scenarios, significantly outperforming regex baselines (61.6%), detects 97.8% of malicious third-party skill patterns, and achieves 100% detection accuracy in our adversarial code execution benchmark. These results demonstrate that SafeClaw-R enables practical runtime enforcement for autonomous MASs.

CVSep 13, 2024
Natural Adversaries: Fuzzing Autonomous Vehicles with Realistic Roadside Object Placements

Yang Sun, Haoyu Wang, Christopher M. Poskitt et al.

The emergence of Autonomous Vehicles (AVs) has spurred research into testing the resilience of their perception systems, i.e., ensuring that they are not susceptible to critical misjudgements. It is important that these systems are tested not only with respect to other vehicles on the road, but also with respect to objects placed on the roadside. Trash bins, billboards, and greenery are examples of such objects, typically positioned according to guidelines developed for the human visual system, which may not align perfectly with the needs of AVs. Existing tests, however, usually focus on adversarial objects with conspicuous shapes or patches, which are ultimately unrealistic due to their unnatural appearance and reliance on white-box knowledge. In this work, we introduce a black-box attack on AV perception systems that creates realistic adversarial scenarios (i.e., satisfying road design guidelines) by manipulating the positions of common roadside objects and without resorting to "unnatural" adversarial patches. In particular, we propose TrashFuzz, a fuzzing algorithm that finds scenarios in which the placement of these objects leads to substantial AV misperceptions -- such as mistaking a traffic light's colour -- with the overall goal of causing traffic-law violations. To ensure realism, these scenarios must satisfy several rules encoding regulatory guidelines governing the placement of objects on public streets. We implemented and evaluated these attacks on the Apollo autonomous driving system, finding that TrashFuzz induced violations of 15 out of 24 traffic laws.

PLFeb 8, 2022Code
K-ST: A Formal Executable Semantics of the Structured Text Language for PLCs

Kun Wang, Jingyi Wang, Christopher M. Poskitt et al.

Programmable Logic Controllers (PLCs) are responsible for automating process control in many industrial systems (e.g. in manufacturing and public infrastructure), and thus it is critical to ensure that they operate correctly and safely. The majority of PLCs are programmed in languages such as Structured Text (ST). However, a lack of formal semantics makes it difficult to ascertain the correctness of their translators and compilers, which vary from vendor-to-vendor. In this work, we develop K-ST, a formal executable semantics for ST in the K framework. Defined with respect to the IEC 61131-3 standard and PLC vendor manuals, K-ST is a high-level reference semantics that can be used to evaluate the correctness and consistency of different ST implementations. We validate K-ST by executing 509 ST programs extracted from Github and comparing the results against existing commercial compilers (i.e., CODESYS, CX-Programmer, and GX Works2). We then apply K-ST to validate the implementation of the open source OpenPLC platform, comparing the executions of several test programs to uncover five bugs and nine functional defects in the compiler.

AIMar 24, 2025
AgentSpec: Customizable Runtime Enforcement for Safe and Reliable LLM Agents

Haoyu Wang, Christopher M. Poskitt, Jun Sun

Agents built on LLMs are increasingly deployed across diverse domains, automating complex decision-making and task execution. However, their autonomy introduces safety risks, including security vulnerabilities, legal violations, and unintended harmful actions. Existing mitigation methods, such as model-based safeguards and early enforcement strategies, fall short in robustness, interpretability, and adaptability. To address these challenges, we propose AgentSpec, a lightweight domain-specific language for specifying and enforcing runtime constraints on LLM agents. With AgentSpec, users define structured rules that incorporate triggers, predicates, and enforcement mechanisms, ensuring agents operate within predefined safety boundaries. We implement AgentSpec across multiple domains, including code execution, embodied agents, and autonomous driving, demonstrating its adaptability and effectiveness. Our evaluation shows that AgentSpec successfully prevents unsafe executions in over 90% of code agent cases, eliminates all hazardous actions in embodied agent tasks, and enforces 100% compliance by autonomous vehicles (AVs). Despite its strong safety guarantees, AgentSpec remains computationally lightweight, with overheads in milliseconds. By combining interpretability, modularity, and efficiency, AgentSpec provides a practical and scalable solution for enforcing LLM agent safety across diverse applications. We also automate the generation of rules using LLMs and assess their effectiveness. Our evaluation shows that the rules generated by OpenAI o1 achieve a precision of 95.56% and recall of 70.96% for embodied agents, successfully identify 87.26% of the risky code, and prevent AVs from breaking laws in 5 out of 8 scenarios.

39.4CRApr 7
From Incomplete Architecture to Quantified Risk: Multimodal LLM-Driven Security Assessment for Cyber-Physical Systems

Shaofei Huang, Christopher M. Poskitt, Lwin Khin Shar

Cyber-physical systems often contend with incomplete architectural documentation or outdated information resulting from legacy technologies, knowledge management gaps, and the complexity of integrating diverse subsystems over extended operational lifecycles. This architectural incompleteness impedes reliable security assessment, as inaccurate or missing architectural knowledge limits the identification of system dependencies, attack surfaces, and risk propagation pathways. To address this foundational challenge, this paper introduces ASTRAL (Architecture-Centric Security Threat Risk Assessment using LLMs), an architecture-centric security assessment technique implemented in a prototype tool powered by multimodal LLMs. The proposed approach assists practitioners in reconstructing and analysing CPS architectures when documentation is fragmented or absent. By leveraging prompt chaining, few-shot learning, and architectural reasoning, ASTRAL extracts and synthesises system representations from disparate data sources. By integrating LLM reasoning with architectural modelling, our approach supports adaptive threat identification and quantitative risk estimation for cyber-physical systems. We evaluated the approach through an ablation study across multiple CPS case studies and an expert evaluation involving 14 experienced cybersecurity practitioners. Practitioner feedback suggests that ASTRAL is useful and reliable for supporting architecture-centric security assessment. Overall, the results indicate that the approach can support more informed cyber risk management decisions.

SEMay 3, 2025
Runtime Anomaly Detection for Drones: An Integrated Rule-Mining and Unsupervised-Learning Approach

Ivan Tan, Wei Minn, Christopher M. Poskitt et al.

UAVs, commonly referred to as drones, have witnessed a remarkable surge in popularity due to their versatile applications. These cyber-physical systems depend on multiple sensor inputs, such as cameras, GPS receivers, accelerometers, and gyroscopes, with faults potentially leading to physical instability and serious safety concerns. To mitigate such risks, anomaly detection has emerged as a crucial safeguarding mechanism, capable of identifying the physical manifestations of emerging issues and allowing operators to take preemptive action at runtime. Recent anomaly detection methods based on LSTM neural networks have shown promising results, but three challenges persist: the need for models that can generalise across the diverse mission profiles of drones; the need for interpretability, enabling operators to understand the nature of detected problems; and the need for capturing domain knowledge that is difficult to infer solely from log data. Motivated by these challenges, this paper introduces RADD, an integrated approach to anomaly detection in drones that combines rule mining and unsupervised learning. In particular, we leverage rules (or invariants) to capture expected relationships between sensors and actuators during missions, and utilise unsupervised learning techniques to cover more subtle relationships that the rules may have missed. We implement this approach using the ArduPilot drone software in the Gazebo simulator, utilising 44 rules derived across the main phases of drone missions, in conjunction with an ensemble of five unsupervised learning models. We find that our integrated approach successfully detects 93.84% of anomalies over six types of faults with a low false positive rate (2.33%), and can be deployed effectively at runtime. Furthermore, RADD outperforms a state-of-the-art LSTM-based method in detecting the different types of faults evaluated in our study.

CYSep 19, 2021
Mind the Gap: Reimagining an Interactive Programming Course for the Synchronous Hybrid Classroom

Christopher M. Poskitt, Kyong Jin Shim, Yi Meng Lau et al.

COVID-19 has significantly affected universities, forcing many courses to be delivered entirely online. As countries bring the pandemic under control, a potential way to safely resume some face-to-face teaching is the synchronous hybrid classroom, in which physically and remotely attending students are taught simultaneously. This comes with challenges, however, including the risk that remotely attending students perceive a 'gap' between their engagement and that of their physical peers. In this experience report, we describe how an interactive programming course was adapted to hybrid delivery in a way that mitigated this risk. Our solution centred on the use of a professional communication platform - Slack - to equalise participation opportunities and to facilitate peer learning. Furthermore, to mitigate 'Zoom fatigue', we implemented a semi-flipped classroom, covering concepts in videos and using shorter lessons to consolidate them. Finally, we critically reflect on the results of a student survey and our own experiences of implementing the solution.

CYSep 18, 2021
Steps Before Syntax: Helping Novice Programmers Solve Problems using the PCDIT Framework

Oka Kurniawan, Cyrille Jégourel, Norman Tiong Seng Lee et al.

Novice programmers often struggle with problem solving due to the high cognitive loads they face. Furthermore, many introductory programming courses do not explicitly teach it, assuming that problem solving skills are acquired along the way. In this paper, we present 'PCDIT', a non-linear problem solving framework that provides scaffolding to guide novice programmers through the process of transforming a problem specification into an implemented and tested solution for an imperative programming language. A key distinction of PCDIT is its focus on developing concrete cases for the problem early without actually writing test code: students are instead encouraged to think about the abstract steps from inputs to outputs before mapping anything down to syntax. We reflect on our experience of teaching an introductory programming course using PCDIT, and report the results of a survey that suggests it helped students to break down challenging problems, organise their thoughts, and reach working solutions.

CRJun 15, 2021
Code Integrity Attestation for PLCs using Black Box Neural Network Predictions

Yuqi Chen, Christopher M. Poskitt, Jun Sun

Cyber-physical systems (CPSs) are widespread in critical domains, and significant damage can be caused if an attacker is able to modify the code of their programmable logic controllers (PLCs). Unfortunately, traditional techniques for attesting code integrity (i.e. verifying that it has not been modified) rely on firmware access or roots-of-trust, neither of which proprietary or legacy PLCs are likely to provide. In this paper, we propose a practical code integrity checking solution based on privacy-preserving black box models that instead attest the input/output behaviour of PLC programs. Using faithful offline copies of the PLC programs, we identify their most important inputs through an information flow analysis, execute them on multiple combinations to collect data, then train neural networks able to predict PLC outputs (i.e. actuator commands) from their inputs. By exploiting the black box nature of the model, our solution maintains the privacy of the original PLC code and does not assume that attackers are unaware of its presence. The trust instead comes from the fact that it is extremely hard to attack the PLC code and neural networks at the same time and with consistent outcomes. We evaluated our approach on a modern six-stage water treatment plant testbed, finding that it could predict actuator states from PLC inputs with near-100% accuracy, and thus could detect all 120 effective code mutations that we subjected the PLCs to. Finally, we found that it is not practically possible to simultaneously modify the PLC code and apply discreet adversarial noise to our attesters in a way that leads to consistent (mis-)predictions.

CRMay 22, 2021
Adversarial Attacks and Mitigation for Anomaly Detectors of Cyber-Physical Systems

Yifan Jia, Jingyi Wang, Christopher M. Poskitt et al.

The threats faced by cyber-physical systems (CPSs) in critical infrastructure have motivated research into a multitude of attack detection mechanisms, including anomaly detectors based on neural network models. The effectiveness of anomaly detectors can be assessed by subjecting them to test suites of attacks, but less consideration has been given to adversarial attackers that craft noise specifically designed to deceive them. While successfully applied in domains such as images and audio, adversarial attacks are much harder to implement in CPSs due to the presence of other built-in defence mechanisms such as rule checkers(or invariant checkers). In this work, we present an adversarial attack that simultaneously evades the anomaly detectors and rule checkers of a CPS. Inspired by existing gradient-based approaches, our adversarial attack crafts noise over the sensor and actuator values, then uses a genetic algorithm to optimise the latter, ensuring that the neural network and the rule checking system are both deceived.We implemented our approach for two real-world critical infrastructure testbeds, successfully reducing the classification accuracy of their detectors by over 50% on average, while simultaneously avoiding detection by rule checkers. Finally, we explore whether these attacks can be mitigated by training the detectors on adversarial samples.

CRJul 7, 2020
Towards Systematically Deriving Defence Mechanisms from Functional Requirements of Cyber-Physical Systems

Cheah Huei Yoong, Venkata Reddy Palleti, Arlindo Silva et al.

The threats faced by cyber-physical systems (CPSs) in critical infrastructure have motivated the development of different attack detection mechanisms, such as those that monitor for violations of invariants, i.e. properties that always hold in normal operation. Given the complexity of CPSs, several existing approaches focus on deriving invariants automatically from data logs, but these can miss possible system behaviours if they are not represented in that data. Furthermore, resolving any design flaws identified in this process is costly, as the CPS is already built. In this position paper, we propose a systematic method for deriving invariants before a CPS is built by analysing its functional requirements. Our method, inspired by the axiomatic design methodology for systems, iteratively analyses dependencies in the design to construct equations and process graphs that model the invariant relationships between CPS components. As a preliminary study, we applied it to the design of a water treatment plant testbed, implementing checkers for two invariants by using decision trees, and finding that they could detect some examples of attacks on the testbed with high accuracy and without false positives. Finally, we explore how developing our method further could lead to more robust CPSs and reduced costs by identifying design weaknesses before systems are implemented.

SEMay 28, 2020
Active Fuzzing for Testing and Securing Cyber-Physical Systems

Yuqi Chen, Bohan Xuan, Christopher M. Poskitt et al.

Cyber-physical systems (CPSs) in critical infrastructure face a pervasive threat from attackers, motivating research into a variety of countermeasures for securing them. Assessing the effectiveness of these countermeasures is challenging, however, as realistic benchmarks of attacks are difficult to manually construct, blindly testing is ineffective due to the enormous search spaces and resource requirements, and intelligent fuzzing approaches require impractical amounts of data and network access. In this work, we propose active fuzzing, an automatic approach for finding test suites of packet-level CPS network attacks, targeting scenarios in which attackers can observe sensors and manipulate packets, but have no existing knowledge about the payload encodings. Our approach learns regression models for predicting sensor values that will result from sampled network packets, and uses these predictions to guide a search for payload manipulations (i.e. bit flips) most likely to drive the CPS into an unsafe state. Key to our solution is the use of online active learning, which iteratively updates the models by sampling payloads that are estimated to maximally improve them. We evaluate the efficacy of active fuzzing by implementing it for a water purification plant testbed, finding it can automatically discover a test suite of flow, pressure, and over/underflow attacks, all with substantially less time, data, and network access than the most comparable approach. Finally, we demonstrate that our prediction models can also be utilised as countermeasures themselves, implementing them as anomaly detectors and early warning systems.

CRSep 12, 2019
Learning-Guided Network Fuzzing for Testing Cyber-Physical System Defences

Yuqi Chen, Christopher M. Poskitt, Jun Sun et al.

The threat of attack faced by cyber-physical systems (CPSs), especially when they play a critical role in automating public infrastructure, has motivated research into a wide variety of attack defence mechanisms. Assessing their effectiveness is challenging, however, as realistic sets of attacks to test them against are not always available. In this paper, we propose smart fuzzing, an automated, machine learning guided technique for systematically finding 'test suites' of CPS network attacks, without requiring any knowledge of the system's control programs or physical processes. Our approach uses predictive machine learning models and metaheuristic search algorithms to guide the fuzzing of actuators so as to drive the CPS into different unsafe physical states. We demonstrate the efficacy of smart fuzzing by implementing it for two real-world CPS testbeds---a water purification plant and a water distribution system---finding attacks that drive them into 27 different unsafe states involving water flow, pressure, and tank levels, including six that were not covered by an established attack benchmark. Finally, we use our approach to test the effectiveness of an invariant-based defence system for the water treatment plant, finding two attacks that were not detected by its physical invariant checks, highlighting a potential weakness that could be exploited in certain conditions.

SEJan 3, 2018
Learning from Mutants: Using Code Mutation to Learn and Monitor Invariants of a Cyber-Physical System

Yuqi Chen, Christopher M. Poskitt, Jun Sun

Cyber-physical systems (CPS) consist of sensors, actuators, and controllers all communicating over a network; if any subset becomes compromised, an attacker could cause significant damage. With access to data logs and a model of the CPS, the physical effects of an attack could potentially be detected before any damage is done. Manually building a model that is accurate enough in practice, however, is extremely difficult. In this paper, we propose a novel approach for constructing models of CPS automatically, by applying supervised machine learning to data traces obtained after systematically seeding their software components with faults ("mutants"). We demonstrate the efficacy of this approach on the simulator of a real-world water purification plant, presenting a framework that automatically generates mutants, collects data traces, and learns an SVM-based model. Using cross-validation and statistical model checking, we show that the learnt model characterises an invariant physical property of the system. Furthermore, we demonstrate the usefulness of the invariant by subjecting the system to 55 network and code-modification attacks, and showing that it can detect 85% of them from the data logs generated at runtime.

SEOct 11, 2017
A Semantics Comparison Workbench for a Concurrent, Asynchronous, Distributed Programming Language

Claudio Corrodi, Alexander Heußner, Christopher M. Poskitt

A number of high-level languages and libraries have been proposed that offer novel and simple to use abstractions for concurrent, asynchronous, and distributed programming. The execution models that realise them, however, often change over time---whether to improve performance, or to extend them to new language features---potentially affecting behavioural and safety properties of existing programs. This is exemplified by SCOOP, a message-passing approach to concurrent object-oriented programming that has seen multiple changes proposed and implemented, with demonstrable consequences for an idiomatic usage of its core abstraction. We propose a semantics comparison workbench for SCOOP with fully and semi-automatic tools for analysing and comparing the state spaces of programs with respect to different execution models or semantics. We demonstrate its use in checking the consistency of properties across semantics by applying it to a set of representative programs, and highlighting a deadlock-related discrepancy between the principal execution models of SCOOP. Furthermore, we demonstrate the extensibility of the workbench by generalising the formalisation of an execution model to support recently proposed extensions for distributed programming. Our workbench is based on a modular and parameterisable graph transformation semantics implemented in the GROOVE tool. We discuss how graph transformations are leveraged to atomically model intricate language abstractions, how the visual yet algebraic nature of the model can be used to ascertain soundness, and highlight how the approach could be applied to similar languages.

LGSep 15, 2017
Anomaly Detection for a Water Treatment System Using Unsupervised Machine Learning

Jun Inoue, Yoriyuki Yamagata, Yuqi Chen et al.

In this paper, we propose and evaluate the application of unsupervised machine learning to anomaly detection for a Cyber-Physical System (CPS). We compare two methods: Deep Neural Networks (DNN) adapted to time series data generated by a CPS, and one-class Support Vector Machines (SVM). These methods are evaluated against data from the Secure Water Treatment (SWaT) testbed, a scaled-down but fully operational raw water purification plant. For both methods, we first train detectors using a log generated by SWaT operating under normal conditions. Then, we evaluate the performance of both methods using a log generated by SWaT operating under 36 different attack scenarios. We find that our DNN generates fewer false positives than our one-class SVM while our SVM detects slightly more anomalies. Overall, our DNN has a slightly better F measure than our SVM. We discuss the characteristics of the DNN and one-class SVM used in this experiment, and compare the advantages and disadvantages of the two methods.

SESep 6, 2016
Towards Learning and Verifying Invariants of Cyber-Physical Systems by Code Mutation

Yuqi Chen, Christopher M. Poskitt, Jun Sun

Cyber-physical systems (CPS), which integrate algorithmic control with physical processes, often consist of physically distributed components communicating over a network. A malfunctioning or compromised component in such a CPS can lead to costly consequences, especially in the context of public infrastructure. In this short paper, we argue for the importance of constructing invariants (or models) of the physical behaviour exhibited by CPS, motivated by their applications to the control, monitoring, and attestation of components. To achieve this despite the inherent complexity of CPS, we propose a new technique for learning invariants that combines machine learning with ideas from mutation testing. We present a preliminary study on a water treatment system that suggests the efficacy of this approach, propose strategies for establishing confidence in the correctness of invariants, then summarise some research questions and the steps we are taking to investigate them.

DCApr 15, 2016
An Interference-Free Programming Model for Network Objects

Mischael Schill, Christopher M. Poskitt, Bertrand Meyer

Network objects are a simple and natural abstraction for distributed object-oriented programming. Languages that support network objects, however, often leave synchronization to the user, along with its associated pitfalls, such as data races and the possibility of failure. In this paper, we present D-SCOOP, a distributed programming model that allows for interference-free and transaction-like reasoning on (potentially multiple) network objects, with synchronization handled automatically, and network failures managed by a compensation mechanism. We achieve this by leveraging the runtime semantics of a multi-threaded object-oriented concurrency model, directly generalizing it with a message-based protocol for efficiently coordinating remote objects. We present our pathway to fusing these contrasting but complementary ideas, and evaluate the performance overhead of the automatic synchronization in D-SCOOP, finding that it comes close to---or outperforms---explicit locking-based synchronization in Java RMI.

SEMar 1, 2016
A Graph-Based Semantics Workbench for Concurrent Asynchronous Programs

Claudio Corrodi, Alexander Heußner, Christopher M. Poskitt

A number of novel programming languages and libraries have been proposed that offer simpler-to-use models of concurrency than threads. It is challenging, however, to devise execution models that successfully realise their abstractions without forfeiting performance or introducing unintended behaviours. This is exemplified by SCOOP---a concurrent object-oriented message-passing language---which has seen multiple semantics proposed and implemented over its evolution. We propose a "semantics workbench" with fully and semi-automatic tools for SCOOP, that can be used to analyse and compare programs with respect to different execution models. We demonstrate its use in checking the consistency of semantics by applying it to a set of representative programs, and highlighting a deadlock-related discrepancy between the principal execution models of the language. Our workbench is based on a modular and parameterisable graph transformation semantics implemented in the GROOVE tool. We discuss how graph transformations are leveraged to atomically model intricate language abstractions, and how the visual yet algebraic nature of the model can be used to ascertain soundness.

SEAug 17, 2015
The AutoProof Verifier: Usability by Non-Experts and on Standard Code

Carlo A. Furia, Christopher M. Poskitt, Julian Tschannen

Formal verification tools are often developed by experts for experts; as a result, their usability by programmers with little formal methods experience may be severely limited. In this paper, we discuss this general phenomenon with reference to AutoProof: a tool that can verify the full functional correctness of object-oriented software. In particular, we present our experiences of using AutoProof in two contrasting contexts representative of non-expert usage. First, we discuss its usability by students in a graduate course on software verification, who were tasked with verifying implementations of various sorting algorithms. Second, we evaluate its usability in verifying code developed for programming assignments of an undergraduate course. The first scenario represents usability by serious non-experts; the second represents usability on "standard code", developed without full functional verification in mind. We report our experiences and lessons learnt, from which we derive some general suggestions for furthering the development of verification tools with respect to improving their usability.

SEApr 10, 2015
Towards Practical Graph-Based Verification for an Object-Oriented Concurrency Model

Alexander Heußner, Christopher M. Poskitt, Claudio Corrodi et al.

To harness the power of multi-core and distributed platforms, and to make the development of concurrent software more accessible to software engineers, different object-oriented concurrency models such as SCOOP have been proposed. Despite the practical importance of analysing SCOOP programs, there are currently no general verification approaches that operate directly on program code without additional annotations. One reason for this is the multitude of partially conflicting semantic formalisations for SCOOP (either in theory or by-implementation). Here, we propose a simple graph transformation system (GTS) based run-time semantics for SCOOP that grasps the most common features of all known semantics of the language. This run-time model is implemented in the state-of-the-art GTS tool GROOVE, which allows us to simulate, analyse, and verify a subset of SCOOP programs with respect to deadlocks and other behavioural properties. Besides proposing the first approach to verify SCOOP programs by automatic translation to GTS, we also highlight our experiences of applying GTS (and especially GROOVE) for specifying semantics in the form of a run-time model, which should be transferable to GTS models for other concurrent languages and libraries.

DCOct 24, 2014
Contract-Based General-Purpose GPU Programming

Alexey Kolesnichenko, Christopher M. Poskitt, Sebastian Nanz et al.

Using GPUs as general-purpose processors has revolutionized parallel computing by offering, for a large and growing set of algorithms, massive data-parallelization on desktop machines. An obstacle to widespread adoption, however, is the difficulty of programming them and the low-level control of the hardware required to achieve good performance. This paper suggests a programming library, SafeGPU, that aims at striking a balance between programmer productivity and performance, by making GPU data-parallel operations accessible from within a classical object-oriented programming language. The solution is integrated with the design-by-contract approach, which increases confidence in functional program correctness by embedding executable program specifications into the program text. We show that our library leads to modular and maintainable code that is accessible to GPGPU non-experts, while providing performance that is comparable with hand-written CUDA code. Furthermore, runtime contract checking turns out to be feasible, as the contracts can be executed on the GPU.