LGOct 3, 2022
Reward Learning with Trees: Methods and EvaluationTom Bewley, Jonathan Lawry, Arthur Richards et al.
Recent efforts to learn reward functions from human feedback have tended to use deep neural networks, whose lack of transparency hampers our ability to explain agent behaviour or verify alignment. We explore the merits of learning intrinsically interpretable tree models instead. We develop a recently proposed method for learning reward trees from preference labels, and show it to be broadly competitive with neural networks on challenging high-dimensional tasks, with good robustness to limited or corrupted data. Having found that reward tree learning can be done effectively in complex settings, we then consider why it should be used, demonstrating that the interpretable reward structure gives significant scope for traceability, verification and explanation.
AIMay 26, 2023
Learning Interpretable Models of Aircraft Handling Behaviour by Reinforcement Learning from Human FeedbackTom Bewley, Jonathan Lawry, Arthur Richards
We propose a method to capture the handling abilities of fast jet pilots in a software model via reinforcement learning (RL) from human preference feedback. We use pairwise preferences over simulated flight trajectories to learn an interpretable rule-based model called a reward tree, which enables the automated scoring of trajectories alongside an explanatory rationale. We train an RL agent to execute high-quality handling behaviour by using the reward tree as the objective, and thereby generate data for iterative preference collection and further refinement of both tree and agent. Experiments with synthetic preferences show reward trees to be competitive with uninterpretable neural network reward models on quantitative and qualitative evaluations.
AIJan 17, 2022
Summarising and Comparing Agent Dynamics with Contrastive Spatiotemporal AbstractionTom Bewley, Jonathan Lawry, Arthur Richards
We introduce a data-driven, model-agnostic technique for generating a human-interpretable summary of the salient points of contrast within an evolving dynamical system, such as the learning process of a control agent. It involves the aggregation of transition data along both spatial and temporal dimensions according to an information-theoretic divergence measure. A practical algorithm is outlined for continuous state spaces, and deployed to summarise the learning histories of deep reinforcement learning agents with the aid of graphical and textual communication methods. We expect our method to be complementary to existing techniques in the realm of agent interpretability.
AIJun 19, 2020
Modelling Agent Policies with Interpretable Imitation LearningTom Bewley, Jonathan Lawry, Arthur Richards
As we deploy autonomous agents in safety-critical domains, it becomes important to develop an understanding of their internal mechanisms and representations. We outline an approach to imitation learning for reverse-engineering black box agent policies in MDP environments, yielding simplified, interpretable models in the form of decision trees. As part of this process, we explicitly model and learn agents' latent state representations by selecting from a large space of candidate features constructed from the Markov state. We present initial promising results from an implementation in a multi-agent traffic environment.
SEJun 3, 2016
A Fuzzy Approach to Qualification in Design Exploration for Autonomous Robots and SystemsJeremy Morse, Dejanira Araiza-Illan, Jonathan Lawry et al.
Autonomous robots must operate in complex and changing environments subject to requirements on their behaviour. Verifying absolute satisfaction (true or false) of these requirements is challenging. Instead, we analyse requirements that admit flexible degrees of satisfaction. We analyse vague requirements using fuzzy logic, and probabilistic requirements using model checking. The resulting analysis method provides a partial ordering of system designs, identifying trade-offs between different requirements in terms of the degrees to which they are satisfied. A case study involving a home care robot interacting with a human is used to demonstrate the approach.
ROMar 3, 2016
Formal Specification and Analysis of Autonomous Systems under Partial ComplianceJeremy Morse, Dejanira Araiza-Illan, Jonathan Lawry et al.
The widespread adoption of autonomous systems depends on providing guarantees of safety and functional correctness, at both design time and runtime. Information about the extent to which functional requirements can be met in combination with non-functional requirements (NFRs) -- i.e. requirements that can be partially complied with -- , under dynamic and uncertain environments, provides opportunities to enhance the safety and functional correctness of systems at design time. We present a technique to formally define system attributes that can change or be changed to deal with dynamic and uncertain environments (denominated weakened specifications) as a partially ordered lattice, and to automatically explore the system under different specifications, using probabilistic model checking, to find the likelihood of satisfying a requirement. The resulting probabilities form boundaries of "optimal specifications", analogous to Pareto frontiers in multi-objective optimization, informing the designer about the system's capabilities, such as resilience or robustness, when changing its attributes to deal with dynamic and uncertain environments. We illustrate the proposed technique through a domestic robotic assistant example.
SYMay 21, 2015
Verification of Control Systems Implemented in Simulink with Assertion Checks and Theorem Proving: A Case StudyDejanira Araiza-Illan, Kerstin Eder, Arthur Richards
This paper presents the verification of control systems implemented in Simulink. The goal is to ensure that high-level requirements on control performance, like stability, are satisfied by the Simulink diagram. A two stage process is proposed. First, the high-level requirements are decomposed into specific parametrized sub-requirements and implemented as assertions in Simulink. Second, the verification takes place. On one hand, the sub-requirements are verified through assertion checks in simulation. On the other hand, according to their scope, some of the sub-requirements are verified through assertion checks in simulation, and others via automatic theorem proving over an ideal mathematical model of the diagram. We compare performing only assertion checks against the use of theorem proving, to highlight the advantages of the latter. Theorem proving performs verification by computing a mathematical proof symbolically, covering the entire state space of the variables. An automatic translation tool from Simulink to the language of the theorem proving tool Why3 is also presented. The paper demonstrates our approach by verifying the stability of a simple discrete linear system.