LGNov 8, 2021
On Assessing The Safety of Reinforcement Learning algorithms Using Formal MethodsPaulina Stevia Nouwou Mindom, Amin Nikanjam, Foutse Khomh et al.
The increasing adoption of Reinforcement Learning in safety-critical systems domains such as autonomous vehicles, health, and aviation raises the need for ensuring their safety. Existing safety mechanisms such as adversarial training, adversarial detection, and robust learning are not always adapted to all disturbances in which the agent is deployed. Those disturbances include moving adversaries whose behavior can be unpredictable by the agent, and as a matter of fact harmful to its learning. Ensuring the safety of critical systems also requires methods that give formal guarantees on the behaviour of the agent evolving in a perturbed environment. It is therefore necessary to propose new solutions adapted to the learning challenges faced by the agent. In this paper, first we generate adversarial agents that exhibit flaws in the agent's policy by presenting moving adversaries. Secondly, We use reward shaping and a modified Q-learning algorithm as defense mechanisms to improve the agent's policy when facing adversarial perturbations. Finally, probabilistic model checking is employed to evaluate the effectiveness of both mechanisms. We have conducted experiments on a discrete grid world with a single agent facing non-learning and learning adversaries. Our results show a diminution in the number of collisions between the agent and the adversaries. Probabilistic model checking provides lower and upper probabilistic bounds regarding the agent's safety in the adversarial environment.
CROct 14, 2015
Probabilistic Opacity in Refinement-Based ModelingBéatrice Bérard, Olga Kouchnarenko, John Mullins et al.
Given a probabilistic transition system (PTS) $\cal A$ partially observed by an attacker, and an $ω$-regular predicate $\varphi$over the traces of $\cal A$, measuring the disclosure of the secret $\varphi$ in $\cal A$ means computing the probability that an attacker who observes a run of $\cal A$ can ascertain that its trace belongs to $\varphi$. In the context of refinement, we consider specifications given as Interval-valued Discrete Time Markov Chains (IDTMCs), which are underspecified Markov chains where probabilities on edges are only required to belong to intervals. Scheduling an IDTMC $\cal S$ produces a concrete implementation as a PTS and we define the worst case disclosure of secret $\varphi$ in ${\cal S}$ as the maximal disclosure of $\varphi$ over all PTSs thus produced. We compute this value for a subclass of IDTMCs and we prove that refinement can only improve the opacity of implementations.
SEOct 12, 2014
A Time-Triggered Constraint-Based Calculus for Avionic SystemsSardaouna Hamadou, Abdelouahed Gherbi, John Mullins et al.
The Integrated Modular Avionics (IMA) architec- ture and the Time-Triggered Ethernet (TTEthernet) network have emerged as the key components of a typical architecture model for recent civil aircrafts. We propose a real-time constraint-based calculus targeted at the analysis of such concepts of avionic embedded systems. We show our framework at work on the modelisation of both the (IMA) architecture and the TTEthernet network, illustrating their behavior by the well-known Flight Management System (FMS).
CRSep 2, 2014
Verification of Information Flow Properties under Rational ObservationBéatrice Bérard, John Mullins
Information flow properties express the capability for an agent to infer information about secret behaviours of a partially observable system. In a language-theoretic setting, where the system behaviour is described by a language, we define the class of rational information flow properties (RIFP), where observers are modeled by finite transducers, acting on languages in a given family $\mathcal{L}$. This leads to a general decidability criterion for the verification problem of RIFPs on $\mathcal{L}$, implying PSPACE-completeness for this problem on regular languages. We show that most trace-based information flow properties studied up to now are RIFPs, including those related to selective declassification and conditional anonymity. As a consequence, we retrieve several existing decidability results that were obtained by ad-hoc proofs.
CRDec 22, 2013
Opacity with Orwellian Observers and Intransitive Non-interferenceJohn Mullins, Moez Yeddes
Opacity is a general behavioural security scheme flexible enough to account for several specific properties. Some secret set of behaviors of a system is opaque if a passive attacker can never tell whether the observed behavior is a secret one or not. Instead of considering the case of static observability where the set of observable events is fixed off line or dynamic observability where the set of observable events changes over time depending on the history of the trace, we consider Orwellian partial observability where unobservable events are not revealed unless a downgrading event occurs in the future of the trace. We show how to verify that some regular secret is opaque for a regular language L w.r.t. an Orwellian projection while it has been proved undecidable even for a regular language L w.r.t. a general Orwellian observation function. We finally illustrate relevancy of our results by proving the equivalence between the opacity property of regular secrets w.r.t. Orwellian projection and the intransitive non-interference property.