AIJun 22, 2022
On Specifying for TrustworthinessDhaminda B. Abeywickrama, Amel Bennaceur, Greg Chance et al.
As autonomous systems (AS) increasingly become part of our daily lives, ensuring their trustworthiness is crucial. In order to demonstrate the trustworthiness of an AS, we first need to specify what is required for an AS to be considered trustworthy. This roadmap paper identifies key challenges for specifying for trustworthiness in AS, as identified during the "Specifying for Trustworthiness" workshop held as part of the UK Research and Innovation (UKRI) Trustworthy Autonomous Systems (TAS) programme. We look across a range of AS domains with consideration of the resilience, trust, functionality, verifiability, security, and governance and regulation of AS and identify some of the key specification challenges in these domains. We then highlight the intellectual challenges that are involved with specifying for trustworthiness in AS that cut across domains and are exacerbated by the inherent uncertainty involved with the environments in which AS need to operate.
SYNov 2, 2015Code
Evaluating Model Checking Approaches to Verify Stability of Control Systems in SimulinkDejanira Araiza-Illan, Kerstin Eder
This paper examines the verification of stability, a control requirement, over discrete control systems represented as Simulink diagrams, using different model checking approaches and tools. Model checking comprises the (exhaustive) exploration of a model of a system, to determine if a requirement is satisfied. If that is not the case, examples of the requirement's violation within the system's model are provided, as witnesses. These examples are potentially complementary to previous work on automatic theorem proving, when a system is not proven to be stable, but no proof of instability can be provided. We experimentally evaluated the suitability of four model checking approaches to verify stability on a set of benchmarks including linear and nonlinear, controlled and uncontrolled, discrete systems, via Lyapunov's second method or Lyapunov's direct method. Our study included symbolic, bounded, statistical and hybrid model checking, through the open-source tools NuSMV, UCLID, S-TaLiRo and SpaceEx, respectively. Our experiments and results provide an insight on the strengths and limitations of these model checking approaches for the verification of control requirements for discrete systems at Simulink level. We found that statistical model checking with S-TaLiRo is the most suitable option to complement our previous work on automatic theorem proving.
ROFeb 20, 2023
AERoS: Assurance of Emergent Behaviour in Autonomous Robotic SwarmsDhaminda B. Abeywickrama, James Wilson, Suet Lee et al.
The behaviours of a swarm are not explicitly engineered. Instead, they are an emergent consequence of the interactions of individual agents with each other and their environment. This emergent functionality poses a challenge to safety assurance. The main contribution of this paper is a process for the safety assurance of emergent behaviour in autonomous robotic swarms called AERoS, following the guidance on the Assurance of Machine Learning for use in Autonomous Systems (AMLAS). We explore our proposed process using a case study centred on a robot swarm operating a public cloakroom.
ARMay 17, 2022
Supervised Learning for Coverage-Directed Test Selection in Simulation-Based VerificationNyasha Masamba, Kerstin Eder, Tim Blackmore
Constrained random test generation is one of the most widely adopted methods for generating stimuli for simulation-based verification. Randomness leads to test diversity, but tests tend to repeatedly exercise the same design logic. Constraints are written (typically manually) to bias random tests towards interesting, hard-to-reach, and yet-untested logic. However, as verification progresses, most constrained random tests yield little to no effect on functional coverage. If stimuli generation consumes significantly less resources than simulation, then a better approach involves randomly generating a large number of tests, selecting the most effective subset, and only simulating that subset. In this paper, we introduce a novel method for automatic constraint extraction and test selection. This method, which we call coverage-directed test selection, is based on supervised learning from coverage feedback. Our method biases selection towards tests that have a high probability of increasing functional coverage, and prioritises them for simulation. We show how coverage-directed test selection can reduce manual constraint writing, prioritise effective tests, reduce verification resource consumption, and accelerate coverage closure on a large, real-life industrial hardware design.
ROJul 3, 2023
Soft Gripping: Specifying for TrustworthinessDhaminda B. Abeywickrama, Nguyen Hao Le, Greg Chance et al.
Soft robotics is an emerging technology in which engineers create flexible devices for use in a variety of applications. In order to advance the wide adoption of soft robots, ensuring their trustworthiness is essential; if soft robots are not trusted, they will not be used to their full potential. In order to demonstrate trustworthiness, a specification needs to be formulated to define what is trustworthy. However, even for soft robotic grippers, which is one of the most mature areas in soft robotics, the soft robotics community has so far given very little attention to formulating specifications. In this work, we discuss the importance of developing specifications during development of soft robotic systems, and present an extensive example specification for a soft gripper for pick-and-place tasks for grocery items. The proposed specification covers both functional and non-functional requirements, such as reliability, safety, adaptability, predictability, ethics, and regulations. We also highlight the need to promote verifiability as a first-class objective in the design of a soft gripper.
SEJul 1, 2022
Using Neural Networks for Novelty-based Test Selection to Accelerate Functional Coverage ClosureXuan Zheng, Kerstin Eder, Tim Blackmore
Novel test selectors used in simulation-based verification have been shown to significantly accelerate coverage closure regardless of the number of coverage holes. This paper presents a configurable and highly-automated framework for novel test selection based on neural networks. Three configurations of this framework are tested with a commercial signal processing unit. All three convincingly outperform random test selection with the largest saving of simulation being 49.37% to reach 99.5% coverage. The computational expense of the configurations is negligible compared to the simulation reduction. We compare the experimental results and discuss important characteristics related to the performance of the configurations.
LGJan 21, 2023
A Trustworthiness Score to Evaluate DNN PredictionsAbanoub Ghobrial, Darryl Hond, Hamid Asgari et al.
Due to the black box nature of deep neural networks (DNN), the continuous validation of DNN during operation is challenging with the absence of a human monitor. As a result this makes it difficult for developers and regulators to gain confidence in the deployment of autonomous systems employing DNN. It is critical for safety during operation to know when DNN's predictions are trustworthy or suspicious. With the absence of a human monitor, the basic approach is to use the model's output confidence score to assess if predictions are trustworthy or suspicious. However, the model's confidence score is a result of computations coming from a black box, therefore lacks transparency and makes it challenging to automatedly credit trustworthiness to predictions. We introduce the trustworthiness score (TS), a simple metric that provides a more transparent and effective way of providing confidence in DNN predictions compared to model's confidence score. The metric quantifies the trustworthiness in a prediction by checking for the existence of certain features in the predictions made by the DNN. We also use the underlying idea of the TS metric, to provide a suspiciousness score (SS) in the overall input frame to help in the detection of suspicious frames where false negatives exist. We conduct a case study using YOLOv5 on persons detection to demonstrate our method and usage of TS and SS. The case study shows that using our method consistently improves the precision of predictions compared to relying on model confidence score alone, for both 1) approving of trustworthy predictions (~20% improvement) and 2) detecting suspicious frames (~5% improvement).
ARMay 19, 2022
Hybrid Intelligent Testing in Simulation-Based VerificationNyasha Masamba, Kerstin Eder, Tim Blackmore
Efficient and effective testing for simulation-based hardware verification is challenging. Using constrained random test generation, several millions of tests may be required to achieve coverage goals. The vast majority of tests do not contribute to coverage progress, yet they consume verification resources. In this paper, we propose a hybrid intelligent testing approach combining two methods that have previously been treated separately, namely Coverage-Directed Test Selection and Novelty-Driven Verification. Coverage-Directed Test Selection learns from coverage feedback to bias testing toward the most effective tests. Novelty-Driven Verification learns to identify and simulate stimuli that differ from previous stimuli, thereby reducing the number of simulations and increasing testing efficiency. We discuss the strengths and limitations of each method, and we show how our approach addresses each method's limitations, leading to hardware testing that is both efficient and effective.
LGApr 30, 2022
DIRA: Dynamic Domain Incremental Regularised AdaptationAbanoub Ghobrial, Xuan Zheng, Darryl Hond et al.
Autonomous systems (AS) often use Deep Neural Network (DNN) classifiers to allow them to operate in complex, high-dimensional, non-linear, and dynamically changing environments. Due to the complexity of these environments, DNN classifiers may output misclassifications during operation when they face domains not identified during development. Removing a system from operation for retraining becomes impractical as the number of such AS increases. To increase AS reliability and overcome this limitation, DNN classifiers need to have the ability to adapt during operation when faced with different operational domains using a few samples (e.g. 2 to 100 samples). However, retraining DNNs on a few samples is known to cause catastrophic forgetting and poor generalisation. In this paper, we introduce Dynamic Incremental Regularised Adaptation (DIRA), an approach for dynamic operational domain adaption of DNNs using regularisation techniques. We show that DIRA improves on the problem of forgetting and achieves strong gains in performance when retraining using a few samples from the target domain. Our approach shows improvements on different image classification benchmarks aimed at evaluating robustness to distribution shifts (e.g.CIFAR-10C/100C, ImageNet-C), and produces state-of-the-art performance in comparison with other methods from the literature.
ROJul 22, 2024
Autonomous Robotic Swarms: A Corroborative Approach for Verification and ValidationDhaminda B. Abeywickrama, Suet Lee, Chris Bennett et al.
The emergent behaviour of autonomous robotic swarms poses a significant challenge to their safety assurance. Assurance tasks encompass adherence to standards, certification processes, and the execution of verification and validation (V&V) methods, such as model checking. In this study, we propose a corroborative approach for formally verifying and validating autonomous robotic swarms, which are defined at the macroscopic formal modelling, low-fidelity simulation, high-fidelity simulation, and real-robot levels. Our formal macroscopic models, used for verification, are characterised by data derived from actual simulations to ensure both accuracy and traceability across different swarm system models. Furthermore, our work combines formal verification with simulations and experimental validation using real robots. In this way, our corroborative approach for V&V seeks to enhance confidence in the evidence, in contrast to employing these methods separately. We explore our approach through a case study focused on a swarm of robots operating within a public cloakroom.
ARMar 5, 2025Code
Review of Machine Learning for Micro-Electronic Design VerificationChristopher Bennett, Kerstin Eder
Microelectronic design verification remains a critical bottleneck in device development, traditionally mitigated by expanding verification teams and computational resources. Since the late 1990s, machine learning (ML) has been proposed to enhance verification efficiency, yet many techniques have not achieved mainstream adoption. This review, from the perspective of verification and ML practitioners, examines the application of ML in dynamic-based techniques for functional verification of microelectronic designs, and provides a starting point for those new to this interdisciplinary field. Historical trends, techniques, ML types, and evaluation baselines are analysed to understand why previous research has not been widely adopted in industry. The review highlights the application of ML, the techniques used and critically discusses their limitations and successes. Although there is a wealth of promising research, real-world adoption is hindered by challenges in comparing techniques, identifying suitable applications, and the expertise required for implementation. This review proposes that the field can progress through the creation and use of open datasets, common benchmarks, and verification targets. By establishing open evaluation criteria, industry can guide future research. Parallels with ML in software verification suggest potential for collaboration. Additionally, greater use of open-source designs and verification environments can allow more researchers from outside the hardware verification discipline to contribute to the challenge of verifying microelectronic designs.
ROApr 7, 2021Code
On Determinism of Game Engines used for Simulation-based Autonomous Vehicle VerificationGreg Chance, Abanoub Ghobrial, Kevin McAreavey et al.
Game engines are increasingly used as simulation platforms by the autonomous vehicle (AV) community to develop vehicle control systems and test environments. A key requirement for simulation-based development and verification is determinism, since a deterministic process will always produce the same output given the same initial conditions and event history. Thus, in a deterministic simulation environment, tests are rendered repeatable and yield simulation results that are trustworthy and straightforward to debug. However, game engines are seldom deterministic. This paper reviews and identifies the potential causes of non-deterministic behaviours in game engines. A case study using CARLA, an open-source autonomous driving simulation environment powered by Unreal Engine, is presented to highlight its inherent shortcomings in providing sufficient precision in experimental results. Different configurations and utilisations of the software and hardware are explored to determine an operational domain where the simulation precision is sufficiently low i.e.\ variance between repeated executions becomes negligible for development and testing work. Finally, a method of a general nature is proposed, that can be used to find the domains of permissible variance in game engine simulations for any given system configuration.
DCSep 9, 2015Code
Modeling and visualizing networked multi-core embedded software energy consumptionSteve Kerrison, Kerstin Eder
In this report we present a network-level multi-core energy model and a software development process workflow that allows software developers to estimate the energy consumption of multi-core embedded programs. This work focuses on a high performance, cache-less and timing predictable embedded processor architecture, XS1. Prior modelling work is improved to increase accuracy, then extended to be parametric with respect to voltage and frequency scaling (VFS) and then integrated into a larger scale model of a network of interconnected cores. The modelling is supported by enhancements to an open source instruction set simulator to provide the first network timing aware simulations of the target architecture. Simulation based modelling techniques are combined with methods of results presentation to demonstrate how such work can be integrated into a software developer's workflow, enabling the developer to make informed, energy aware coding decisions. A set of single-, multi-threaded and multi-core benchmarks are used to exercise and evaluate the models and provide use case examples for how results can be presented and interpreted. The models all yield accuracy within an average +/-5 % error margin.
SEMay 31, 2014Code
EACOF: A Framework for Providing Energy Transparency to enable Energy-Aware Software DevelopmentHayden Field, Glen Anderson, Kerstin Eder
Making energy consumption data accessible to software developers is an essential step towards energy efficient software engineering. The presence of various different, bespoke and incompatible, methods of instrumentation to obtain energy readings is currently limiting the widespread use of energy data in software development. This paper presents EACOF, a modular Energy-Aware Computing Framework that provides a layer of abstraction between sources of energy data and the applications that exploit them. EACOF replaces platform specific instrumentation through two APIs - one accepts input to the framework while the other provides access to application software. This allows developers to profile their code for energy consumption in an easy and portable manner using simple API calls. We outline the design of our framework and provide details of the API functionality. In a use case, where we investigate the impact of data bit width on the energy consumption of various sorting algorithms, we demonstrate that the data obtained using EACOF provides interesting, sometimes counter-intuitive, insights. All the code is available online under an open source license. http://github.com/eacof
LGNov 13, 2023
On Self-Supervised Dynamic Incremental Regularised AdaptationAbanoub Ghobrial, Kerstin Eder
In this paper, we give an overview of a recently developed method for dynamic domain adaptation, named DIRA, which relies on a few samples in addition to a regularisation approach, named elastic weight consolidation, to achieve state-of-the-art (SOTA) domain adaptation results. DIRA has been previously shown to perform competitively with SOTA unsupervised adaption techniques. However, a limitation of DIRA is that it relies on labels to be provided for the few samples used in adaption. This makes it a supervised technique. In this paper, we propose a modification to the DIRA method to make it self-supervised i.e. remove the need for providing labels. Our proposed approach will be evaluated experimentally in future work.
8.5AIApr 2
Systematic Analyses of Reinforcement Learning Controllers in Signalized Urban CorridorsXiaofei Song, Kerstin Eder, Jonathan Lawry et al.
In this work, we extend our systematic capacity region perspective to multi-junction traffic networks, focussing on the special case of an urban corridor network. In particular, we train and evaluate centralized, fully decentralized, and parameter-sharing decentralized RL controllers, and compare their capacity regions and ATTs together with a classical baseline MaxPressure controller. Further, we show how the parametersharing controller may be generalised to be deployed on a larger network than it was originally trained on. In this setting, we show some initial findings that suggest that even though the junctions are not formally coordinated, traffic may self organise into `green waves'.
SEJun 19, 2024
Detecting Stimuli with Novel Temporal Patterns to Accelerate Functional Coverage ClosureXuan Zheng, Tim Blackmore, James Buckingham et al.
Novel test selectors have demonstrated their effectiveness in accelerating the closure of functional coverage for various industrial digital designs in simulation-based verification. The primary advantages of these test selectors include performance that is not impacted by coverage holes, straightforward implementation, and relatively low computational expense. However, the detection of stimuli with novel temporal patterns remains largely unexplored. This paper introduces two novel test selectors designed to identify such stimuli. The experiments reveal that both test selectors can accelerate the functional coverage for a commercial bus bridge, compared to random test selection. Specifically, one selector achieves a 26.9\% reduction in the number of simulated tests required to reach 98.5\% coverage, outperforming the savings achieved by two previously published test selectors by factors of 13 and 2.68, respectively.
LGMay 18, 2023
Evaluation Metrics for DNNs CompressionAbanoub Ghobrial, Samuel Budgett, Dieter Balemans et al.
There is a lot of ongoing research effort into developing different techniques for neural networks compression. However, the community lacks standardised evaluation metrics, which are key to identifying the most suitable compression technique for different applications. This paper reviews existing neural network compression evaluation metrics and implements them into a standardisation framework called NetZIP. We introduce two novel metrics to cover existing gaps of evaluation in the literature: 1) Compression and Hardware Agnostic Theoretical Speed (CHATS) and 2) Overall Compression Success (OCS). We demonstrate the use of NetZIP using two case studies on two different hardware platforms (a PC and a Raspberry Pi 4) focusing on object classification and object detection.
AIMay 5, 2023
Assessing Trustworthiness of Autonomous SystemsGregory Chance, Dhaminda B. Abeywickrama, Beckett LeClair et al.
As Autonomous Systems (AS) become more ubiquitous in society, more responsible for our safety and our interaction with them more frequent, it is essential that they are trustworthy. Assessing the trustworthiness of AS is a mandatory challenge for the verification and development community. This will require appropriate standards and suitable metrics that may serve to objectively and comparatively judge trustworthiness of AS across the broad range of current and future applications. The meta-expression `trustworthiness' is examined in the context of AS capturing the relevant qualities that comprise this term in the literature. Recent developments in standards and frameworks that support assurance of autonomous systems are reviewed. A list of key challenges are identified for the community and we present an outline of a process that can be used as a trustworthiness assessment framework for AS.
AIApr 29, 2021
D-VAL: An automatic functional equivalence validation tool for planning domain modelsAnas Shrinah, Derek Long, Kerstin Eder
This paper introduces an approach to validate the functional equivalence of planning domain models. Validating the functional equivalence of planning domain models is the problem of formally confirming that two planning domain models can be used to solve the same set of problems for any set of objects. The need for techniques to validate the functional equivalence of planning domain models has been highlighted in previous research and has applications in model learning, development and extension. We prove the soundness and completeness of our method. We also develop D-VAL, an automatic functional equivalence validation tool for planning domain models. Empirical evaluation shows that D-VAL validates the functional equivalence of all examined domains in less than 43 seconds. Additionally, we provide a benchmark to evaluate the feasibility and performance of this and future related work.
SEApr 2, 2021
A Comprehensive and Accurate Energy Model for Arm's Cortex-M0 ProcessorKyriakos Georgiou, Zbigniew Chamski, Kris Nikov et al.
Energy modeling can enable energy-aware software development and assist the developer in meeting an application's energy budget. Although many energy models for embedded processors exist, most do not account for processor-specific configurations, neither are they suitable for static energy consumption estimation. This paper introduces a comprehensive energy model for Arm's Cortex-M0 processor, ready to support energy-aware development of edge computing applications using either profiling- or static-analysis-based energy consumption estimation. The model accounts for the Frequency, PreFetch, and WaitState processor configurations which all have a significant impact on the execution time and energy consumption of edge computing applications. All models have a prediction error of less than 5%.
ROJun 30, 2020
Formalizing and Guaranteeing* Human-Robot InteractionHadas Kress-Gazit, Kerstin Eder, Guy Hoffman et al.
Robot capabilities are maturing across domains, from self-driving cars, to bipeds and drones. As a result, robots will soon no longer be confined to safety-controlled industrial settings; instead, they will directly interact with the general public. The growing field of Human-Robot Interaction (HRI) studies various aspects of this scenario - from social norms to joint action to human-robot teams and more. Researchers in HRI have made great strides in developing models, methods, and algorithms for robots acting with and around humans, but these "computational HRI" models and algorithms generally do not come with formal guarantees and constraints on their operation. To enable human-interactive robots to move from the lab to real-world deployments, we must address this gap. This article provides an overview of verification, validation and synthesis techniques used to create demonstrably trustworthy systems, describes several HRI domains that could benefit from such techniques, and provides a roadmap for the challenges and the research needed to create formalized and guaranteed human-robot interaction.
CRJun 26, 2020
CyRes -- Avoiding Catastrophic Failure in Connected and Autonomous Vehicles (Extended Abstract)Carsten Maple, Peter Davies, Kerstin Eder et al.
Existing approaches to cyber security and regulation in the automotive sector cannot achieve the quality of outcome necessary to ensure the safe mass deployment of advanced vehicle technologies and smart mobility systems. Without sustainable resilience hard-fought public trust will evaporate, derailing emerging global initiatives to improve the efficiency, safety and environmental impact of future transport. This paper introduces an operational cyber resilience methodology, CyRes, that is suitable for standardisation. The CyRes methodology itself is capable of being tested in court or by publicly appointed regulators. It is designed so that operators understand what evidence should be produced by it and are able to measure the quality of that evidence. The evidence produced is capable of being tested in court or by publicly appointed regulators. Thus, the real-world system to which the CyRes methodology has been applied is capable of operating at all times and in all places with a legally and socially acceptable value of negative consequence.
AINov 22, 2018
Goal-constrained Planning Domain Model Verification of Safety PropertiesAnas Shrinah, Kerstin Eder
The verification of planning domain models is crucial to ensure the safety, integrity and correctness of planning-based automated systems. This task is usually performed using model checking techniques. However, unconstrained application of model checkers to verify planning domain models can result in false positives, i.e.counterexamples that are unreachable by a sound planner when using the domain under verification during a planning task. In this paper, we discuss the downside of unconstrained planning domain model verification. We then introduce the notion of a valid planning counterexample, and demonstrate how model checkers, as well as state trajectory constraints planning techniques, should be used to verify planning domain models so that invalid planning counterexamples are not returned.
SEJun 27, 2017
The IoT energy challenge: A software perspectiveKyriakos Georgiou, Samuel Xavier-de-Souza, Kerstin Eder
The Internet of Things (IoT) sparks a whole new world of embedded applications. Most of these applications are based on deeply embedded systems that have to operate on limited or unreliable sources of energy, such as batteries or energy harvesters. Meeting the energy requirements for such applications is a hard challenge, which threatens the future growth of the IoT. Software has the ultimate control over hardware. Therefore, its role is significant in optimizing the energy consumption of a system. Currently, programmers have no feedback on how their software affects the energy consumption of a system. Such feedback can be enabled by energy transparency, a concept that makes a program's energy consumption visible, from hardware to software. This paper discusses the need for energy transparency in software development and emphasizes on how such transparency can be realized to help tackling the IoT energy challenge.
ROMar 7, 2017
Effects of Faults, Experience, and Personality on Trust in a Robot Co-WorkerSatragni Sarkar, Dejanira Araiza-Illan, Kerstin Eder
To design trustworthy robots, we need to understand the impact factors of trust: people's attitudes, experience, and characteristics; the robot's physical design, reliability, and performance; a task's specification and the circumstances under which it is to be performed, e.g. at leisure or under time pressure. As robots are used for a wide variety of tasks and applications, robot designers ought to be provided with evidence and guidance, to inform their decisions to achieve safe, trustworthy and efficient human-robot interactions. In this work, the impact factors of trust in a collaborative manufacturing scenario are studied by conducting an experiment with a real robot and participants where a physical object was assembled and then disassembled. Objective and subjective measures were employed to evaluate the development of trust, under faulty and non-faulty robot conditions, and the effect of previous experience with robots, and personality traits. Our findings highlight differences when compared to other, more social, scenarios with robotic assistants (such as a home care assistant), in that the condition (faulty or not) does not have a significant impact on the human's perception of the robot in terms of human-likeliness, likeability, trustworthiness, and even competence. However, personality and previous experience do have an effect on how the robot is perceived by participants, even though that is relatively small.
ROFeb 22, 2017
Risk-based Triggering of Bio-inspired Self-Preservation to Protect Robots from ThreatsSing-Kai Chiu, Dejanira Araiza-Illan, Kerstin Eder
Safety in autonomous systems has been mostly studied from a human-centered perspective. Besides the loads they may carry, autonomous systems are also valuable property, and self-preservation mechanisms are needed to protect them in the presence of external threats, including malicious robots and antagonistic humans. We present a biologically inspired risk-based triggering mechanism to initiate self-preservation strategies. This mechanism considers environmental and internal system factors to measure the overall risk at any moment in time, to decide whether behaviours such as fleeing or hiding are necessary, or whether the system should continue on its task. We integrated our risk-based triggering mechanism into a delivery rover that is being attacked by a drone and evaluated its effectiveness through systematic testing in a simulated environment in Robot Operating System (ROS) and Gazebo, with a variety of different randomly generated conditions. We compared the use of the triggering mechanism and different configurations of self-preservation behaviours to not having any of these. Our results show that triggering self-preservation increases the distance between the drone and the rover for many of these configurations, and, in some instances, the drone does not catch up with the rover. Our study demonstrates the benefits of embedding risk awareness and self-preservation into autonomous systems to increase their robustness, and the value of using bio-inspired engineering to find solutions in this area.
AISep 16, 2016
Model-based Test Generation for Robotic Software: Automata versus Belief-Desire-Intention AgentsDejanira Araiza-Illan, Anthony G. Pipe, Kerstin Eder
Robotic code needs to be verified to ensure its safety and functional correctness, especially when the robot is interacting with people. Testing real code in simulation is a viable option. However, generating tests that cover rare scenarios, as well as exercising most of the code, is a challenge amplified by the complexity of the interactions between the environment and the software. Model-based test generation methods can automate otherwise manual processes and facilitate reaching rare scenarios during testing. In this paper, we compare using Belief-Desire-Intention (BDI) agents as models for test generation with more conventional automata-based techniques that exploit model checking, in terms of practicality, performance, transferability to different scenarios, and exploration (`coverage'), through two case studies: a cooperative manufacturing task, and a home care scenario. The results highlight the advantages of using BDI agents for test generation. BDI agents naturally emulate the agency present in Human-Robot Interactions (HRIs), and are thus more expressive than automata. The performance of the BDI-based test generation is at least as high, and the achieved coverage is higher or equivalent, compared to test generation based on model checking automata.
ROAug 26, 2016
A Corroborative Approach to Verification and Validation of Human--Robot TeamsMatt Webster, David Western, Dejanira Araiza-Illan et al.
We present an approach for the verification and validation (V&V) of robot assistants in the context of human-robot interactions (HRI), to demonstrate their trustworthiness through corroborative evidence of their safety and functional correctness. Key challenges include the complex and unpredictable nature of the real world in which assistant and service robots operate, the limitations on available V&V techniques when used individually, and the consequent lack of confidence in the V&V results. Our approach, called corroborative V&V, addresses these challenges by combining several different V&V techniques; in this paper we use formal verification (model checking), simulation-based testing, and user validation in experiments with a real robot. We demonstrate our corroborative V&V approach through a handover task, the most critical part of a complex cooperative manufacturing scenario, for which we propose some safety and liveness requirements to verify and validate. We construct formal models, simulations and an experimental test rig for the HRI. To capture requirements we use temporal logic properties, assertion checkers and textual descriptions. This combination of approaches allows V&V of the HRI task at different levels of modelling detail and thoroughness of exploration, thus overcoming the individual limitations of each technique. Should the resulting V&V evidence present discrepancies, an iterative process between the different V&V techniques takes place until corroboration between the V&V techniques is gained from refining and improving the assets (i.e., system and requirement models) to represent the HRI task in a more truthful manner. Therefore, corroborative V&V affords a systematic approach to 'meta-V&V,' in which different V&V techniques can be used to corroborate and check one another, increasing the level of certainty in the results of V&V.
SEJun 3, 2016
A Fuzzy Approach to Qualification in Design Exploration for Autonomous Robots and SystemsJeremy Morse, Dejanira Araiza-Illan, Jonathan Lawry et al.
Autonomous robots must operate in complex and changing environments subject to requirements on their behaviour. Verifying absolute satisfaction (true or false) of these requirements is challenging. Instead, we analyse requirements that admit flexible degrees of satisfaction. We analyse vague requirements using fuzzy logic, and probabilistic requirements using model checking. The resulting analysis method provides a partial ordering of system designs, identifying trade-offs between different requirements in terms of the degrees to which they are satisfied. A case study involving a home care robot interacting with a human is used to demonstrate the approach.
ROMay 27, 2016
Believing in BERT: Using expressive communication to enhance trust and counteract operational error in physical Human-Robot InteractionAdriana Hamacher, Nadia Bianchi-Berthouze, Anthony G. Pipe et al.
Strategies are necessary to mitigate the impact of unexpected behavior in collaborative robotics, and research to develop solutions is lacking. Our aim here was to explore the benefits of an affective interaction, as opposed to a more efficient, less error prone but non-communicative one. The experiment took the form of an omelet-making task, with a wide range of participants interacting directly with BERT2, a humanoid robot assistant. Having significant implications for design, results suggest that efficiency is not the most important aspect of performance for users; a personable, expressive robot was found to be preferable over a more efficient one, despite a considerable trade off in time taken to perform the task. Our findings also suggest that a robot exhibiting human-like characteristics may make users reluctant to 'hurt its feelings'; they may even lie in order to avoid this.
ROApr 19, 2016
Intelligent Agent-Based Stimulation for Testing Robotic Software in Human-Robot InteractionsDejanira Araiza-Illan, Anthony G. Pipe, Kerstin Eder
The challenges of robotic software testing extend beyond conventional software testing. Valid, realistic and interesting tests need to be generated for multiple programs and hardware running concurrently, deployed into dynamic environments with people. We investigate the use of Belief-Desire-Intention (BDI) agents as models for test generation, in the domain of human-robot interaction (HRI) in simulations. These models provide rational agency, causality, and a reasoning mechanism for planning, which emulate both intelligent and adaptive robots, as well as smart testing environments directed by humans. We introduce reinforcement learning (RL) to automate the exploration of the BDI models using a reward function based on coverage feedback. Our approach is evaluated using a collaborative manufacture example, where the robotic software under test is stimulated indirectly via a simulated human co-worker. We conclude that BDI agents provide intuitive models for test generation in the HRI domain. Our results demonstrate that RL can fully automate BDI model exploration, leading to very effective coverage-directed test generation.
ROMar 3, 2016
Formal Specification and Analysis of Autonomous Systems under Partial ComplianceJeremy Morse, Dejanira Araiza-Illan, Jonathan Lawry et al.
The widespread adoption of autonomous systems depends on providing guarantees of safety and functional correctness, at both design time and runtime. Information about the extent to which functional requirements can be met in combination with non-functional requirements (NFRs) -- i.e. requirements that can be partially complied with -- , under dynamic and uncertain environments, provides opportunities to enhance the safety and functional correctness of systems at design time. We present a technique to formally define system attributes that can change or be changed to deal with dynamic and uncertain environments (denominated weakened specifications) as a partially ordered lattice, and to automatically explore the system under different specifications, using probabilistic model checking, to find the likelihood of satisfying a requirement. The resulting probabilities form boundaries of "optimal specifications", analogous to Pareto frontiers in multi-objective optimization, informing the designer about the system's capabilities, such as resilience or robustness, when changing its attributes to deal with dynamic and uncertain environments. We illustrate the proposed technique through a domestic robotic assistant example.
ROMar 2, 2016
Model-Based Testing, Using Belief-Desire-Intentions Agents, of Control Code for Robots in Collaborative Human-Robot InteractionsDejanira Araiza-Illan, Tony Pipe, Kerstin Eder
The software of robotic assistants needs to be verified, to ensure its safety and functional correctness. Testing in simulation allows a high degree of realism in the verification. However, generating tests that cover both interesting foreseen and unforeseen scenarios in human-robot interaction (HRI) tasks, while executing most of the code, remains a challenge. We propose the use of belief-desire-intention (BDI) agents in the test environment, to increase the level of realism and human-like stimulation of simulated robots. Artificial intelligence, such as agent theory, can be exploited for more intelligent test generation. An automated testbench was implemented for a simulation in Robot Operating System (ROS) and Gazebo, of a cooperative table assembly task between a humanoid robot and a person. Requirements were verified for this task, and some unexpected design issues were discovered, leading to possible code improvements. Our results highlight the practicality of BDI agents to automatically generate valid and human-like tests to get high code coverage, compared to hand-written directed tests, pseudorandom generation, and other variants of model-based test generation. Also, BDI agents allow the coverage of combined behaviours of the HRI system with more ease than writing temporal logic properties for model checking.
RONov 4, 2015
Systematic and Realistic Testing in Simulation of Control Code for Robots in Collaborative Human-Robot InteractionsDejanira Araiza-Illan, David Western, Anthony G. Pipe et al.
Industries such as flexible manufacturing and home care will be transformed by the presence of robotic assistants. Assurance of safety and functional soundness for these robotic systems will require rigorous verification and validation. We propose testing in simulation using Coverage-Driven Verification (CDV) to guide the testing process in an automatic and systematic way. We use a two-tiered test generation approach, where abstract test sequences are computed first and then concretized (e.g., data and variables are instantiated), to reduce the complexity of the test generation problem. To demonstrate the effectiveness of our approach, we developed a testbench for robotic code, running in ROS-Gazebo, that implements an object handover as part of a human-robot interaction (HRI) task. Tests are generated to stimulate the robot's code in a realistic manner, through stimulating the human, environment, sensors, and actuators in simulation. We compare the merits of unconstrained, constrained and model-based test generation in achieving thorough exploration of the code under test, and interesting combinations of human-robot interactions. Our results show that CDV combined with systematic test generation achieves a very high degree of automation in simulation-based verification of control code for robots in HRI.
ROSep 16, 2015
Coverage-Driven Verification - An approach to verify code for robots that directly interact with humansDejanira Araiza-Illan, David Western, Anthony Pipe et al.
Collaborative robots could transform several industries, such as manufacturing and healthcare, but they present a significant challenge to verification. The complex nature of their working environment necessitates testing in realistic detail under a broad range of circumstances. We propose the use of Coverage-Driven Verification (CDV) to meet this challenge. By automating the simulation-based testing process as far as possible, CDV provides an efficient route to coverage closure. We discuss the need, practical considerations, and potential benefits of transferring this approach from microelectronic design verification to the field of human-robot interaction. We demonstrate the validity and feasibility of the proposed approach by constructing a custom CDV testbench and applying it to the verification of an object handover task.
LOJul 20, 2015
Marimba: A Tool for Verifying Properties of Hidden Markov ModelsNoe Hernandez, Kerstin Eder, Evgeni Magid et al.
The formal verification of properties of Hidden Markov Models (HMMs) is highly desirable for gaining confidence in the correctness of the model and the corresponding system. A significant step towards HMM verification was the development by Zhang et al. of a family of logics for verifying HMMs, called POCTL*, and its model checking algorithm. As far as we know, the verification tool we present here is the first one based on Zhang et al.'s approach. As an example of its effective application, we verify properties of a handover task in the context of human-robot interaction. Our tool was implemented in Haskell, and the experimental evaluation was performed using the humanoid robot Bert2.
SYMay 21, 2015
Verification of Control Systems Implemented in Simulink with Assertion Checks and Theorem Proving: A Case StudyDejanira Araiza-Illan, Kerstin Eder, Arthur Richards
This paper presents the verification of control systems implemented in Simulink. The goal is to ensure that high-level requirements on control performance, like stability, are satisfied by the Simulink diagram. A two stage process is proposed. First, the high-level requirements are decomposed into specific parametrized sub-requirements and implemented as assertions in Simulink. Second, the verification takes place. On one hand, the sub-requirements are verified through assertion checks in simulation. On the other hand, according to their scope, some of the sub-requirements are verified through assertion checks in simulation, and others via automatic theorem proving over an ideal mathematical model of the diagram. We compare performing only assertion checks against the use of theorem proving, to highlight the advantages of the latter. Theorem proving performs verification by computing a mathematical proof symbolically, covering the entire state space of the variables. An automatic translation tool from Simulink to the language of the theorem proving tool Why3 is also presented. The paper demonstrates our approach by verifying the stability of a simple discrete linear system.
ROMay 21, 2015
Symmetry Reduction Enables Model Checking of More Complex Emergent Behaviours of Swarm Navigation AlgorithmsLaura Antuña, Dejanira Araiza-Illan, Sérgio Campos et al.
The emergent global behaviours of robotic swarms are important to achieve their navigation task goals. These emergent behaviours can be verified to assess their correctness, through techniques like model checking. Model checking exhaustively explores all possible behaviours, based on a discrete model of the system, such as a swarm in a grid. A common problem in model checking is the state-space explosion that arises when the states of the model are numerous. We propose a novel implementation of symmetry reduction, in the form of encoding navigation algorithms relatively with respect to a reference, based on the symmetrical properties of swarms in grids. We applied the relative encoding to a swarm navigation algorithm, Alpha, modelled for the NuSMV model checker. A comparison of the state-space and verification results with an absolute (or global) and a relative encoding of the Alpha algorithm highlights the advantages of our approach, allowing model checking larger grid sizes and number of robots, and consequently, verifying more complex emergent behaviours. For example, a property was verified for a grid with 3 robots and a maximum allowed size of 8x8 cells in a global encoding, whereas this size was increased to 16x16 using a relative encoding. Also, the time to verify a property for a swarm of 3 robots in a 6x6 grid was reduced from almost 10 hours to only 7 minutes. Our approach is transferable to other swarm navigation algorithms.
SEMay 5, 2015
Using Models at Runtime to Address Assurance for Self-Adaptive SystemsBetty Cheng, Kerstin Eder, Martin Gogolla et al.
A self-adaptive software system modifies its behavior at runtime in response to changes within the system or in its execution environment. The fulfillment of the system requirements needs to be guaranteed even in the presence of adverse conditions and adaptations. Thus, a key challenge for self-adaptive software systems is assurance. Traditionally, confidence in the correctness of a system is gained through a variety of activities and processes performed at development time, such as design analysis and testing. In the presence of selfadaptation, however, some of the assurance tasks may need to be performed at runtime. This need calls for the development of techniques that enable continuous assurance throughout the software life cycle. Fundamental to the development of runtime assurance techniques is research into the use of models at runtime (M@RT). This chapter explores the state of the art for usingM@RT to address the assurance of self-adaptive software systems. It defines what information can be captured by M@RT, specifically for the purpose of assurance, and puts this definition into the context of existing work. We then outline key research challenges for assurance at runtime and characterize assurance methods. The chapter concludes with an exploration of selected application areas where M@RT could provide significant benefits beyond existing assurance techniques for adaptive systems.
ROApr 8, 2014
Towards the Safety of Human-in-the-Loop Robotics: Challenges and Opportunities for Safety Assurance of Robotic Co-WorkersKerstin Eder, Chris Harper, Ute Leonards
The success of the human-robot co-worker team in a flexible manufacturing environment where robots learn from demonstration heavily relies on the correct and safe operation of the robot. How this can be achieved is a challenge that requires addressing both technical as well as human-centric research questions. In this paper we discuss the state of the art in safety assurance, existing as well as emerging standards in this area, and the need for new approaches to safety assurance in the context of learning machines. We then focus on robotic learning from demonstration, the challenges these techniques pose to safety assurance and indicate opportunities to integrate safety considerations into algorithms "by design". Finally, from a human-centric perspective, we stipulate that, to achieve high levels of safety and ultimately trust, the robotic co-worker must meet the innate expectations of the humans it works with. It is our aim to stimulate a discussion focused on the safety aspects of human-in-the-loop robotics, and to foster multidisciplinary collaboration to address the research challenges identified.