Aquib Mustafa

AI · Mar 23, 2021

Assured Learning-enabled Autonomy: A Metacognitive Reinforcement Learning Framework

Aquib Mustafa, Majid Mazouchi, Subramanya Nageshrao et al.

Reinforcement learning (RL) agents with pre-specified reward functions cannot provide guaranteed safety across the variety of circumstances that an uncertain system might encounter. To guarantee performance while assuring satisfaction of safety constraints across these circumstances, this paper presents an assured autonomous control framework that empowers RL algorithms with metacognitive learning capabilities. More specifically, a metacognitive decision-making layer adapts the reward function parameters of the RL agent to assure the feasibility of the RL agent, that is, to assure that the policy learned by the RL agent satisfies safety constraints specified by signal temporal logic while achieving as much performance as possible. The metacognitive layer monitors any possible future safety violation under the actions of the RL agent and employs a higher-layer Bayesian RL algorithm to proactively adapt the reward function for the lower-layer RL agent. To minimize the higher-layer Bayesian RL intervention, the metacognitive layer uses a fitness function as a metric to evaluate the success of the lower-layer RL agent in satisfying safety and liveness specifications, and the higher-layer Bayesian RL intervenes only if there is a risk of lower-layer RL failure. Finally, a simulation example is provided to validate the effectiveness of the proposed approach.
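The monitoring idea in this abstract can be illustrated with a minimal sketch. It is not the paper's fitness function, which the abstract does not specify. It assumes a single safety requirement of the form "the state always stays at or below a limit", scores a predicted trajectory by the standard robustness of that signal temporal logic formula (the smallest remaining margin over time), and flags an intervention only when the margin is used up. The names `robustness_always_below`, `should_intervene`, and the `margin` parameter are hypothetical.

```python
def robustness_always_below(trajectory, limit):
    """Robustness of the STL formula 'always (x <= limit)'.

    Positive: the constraint holds with that much margin.
    Negative: the constraint is violated somewhere on the trajectory.
    """
    return min(limit - x for x in trajectory)


def should_intervene(predicted_trajectory, limit, margin=0.0):
    """Hypothetical trigger: the higher layer steps in only when the
    predicted robustness is at or below the required margin."""
    return robustness_always_below(predicted_trajectory, limit) <= margin


# Assumed example data: predicted states under the current lower-layer policy.
safe_prediction = [0.2, 0.5, 0.9]
unsafe_prediction = [0.2, 0.8, 1.3]

print(should_intervene(safe_prediction, limit=1.0))    # margin remains, no intervention
print(should_intervene(unsafe_prediction, limit=1.0))  # predicted violation, intervene
```

In this sketch the higher layer stays idle while the lower-layer policy is predicted to remain safe, which mirrors the abstract's goal of minimizing intervention.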

SY · May 14, 2019

Attack Analysis and Resilient Control Design for Discrete-time Distributed Multi-agent Systems

Aquib Mustafa, Hamidreza Modares

This work presents a rigorous analysis of the adverse effects of cyber-physical attacks on discrete-time distributed multi-agent systems, and proposes a mitigation approach for attacks on sensors and actuators. First, we show how an attack on a compromised agent can propagate and affect intact agents that are reachable from it. That is, an attack on a single node snowballs into a network-wide attack and can even destabilize the entire system. Moreover, we show that the attacker can bypass the robust $H_{\infty}$ control protocol and make it entirely ineffective in attenuating the effect of the adversarial input on the system performance. Finally, to overcome the adversarial effects of attacks on sensors and actuators, a distributed adaptive attack compensator is designed by estimating the normal expected behavior of agents. The adaptive attack compensator is augmented with the controller, and it is shown that the proposed controller achieves secure consensus in the presence of attacks on sensors and actuators. This controller does not require any restrictive assumption on the number of agents, or of an agent's neighbors, under the direct effect of adversarial input. Moreover, it recovers compromised agents under actuator attacks and avoids propagation of attacks on sensors without removing compromised agents. The effectiveness of the proposed controller and analysis is validated on a network of Sentry autonomous underwater vehicles subject to attacks under different scenarios.
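The propagation claim can be illustrated with a toy example. The sketch below is not the paper's model or its compensator. It assumes three scalar agents on an undirected path graph running a plain discrete-time averaging update, with the attack modeled as a constant additive input on agent 0's actuator. The function `simulate`, the step size `eps`, and all numeric values are hypothetical.

```python
def simulate(x0, neighbors, steps, eps=0.3, attack=None):
    """Run x_i <- x_i + eps * sum_j (x_j - x_i) + attack_i for `steps` rounds.

    `attack` maps an agent index to a constant injected input
    (an assumed, simplified actuator attack).
    """
    attack = attack or {}
    x = list(x0)
    for _ in range(steps):
        x = [
            xi + eps * sum(x[j] - xi for j in neighbors[i]) + attack.get(i, 0.0)
            for i, xi in enumerate(x)
        ]
    return x


# Path graph 0 - 1 - 2; only agent 0 is compromised.
neighbors = {0: [1], 1: [0, 2], 2: [1]}
x0 = [1.0, 2.0, 6.0]

clean = simulate(x0, neighbors, steps=200)
attacked = simulate(x0, neighbors, steps=200, attack={0: 0.1})

print("no attack:  ", clean)     # all agents settle at the initial average, 3.0
print("with attack:", attacked)  # every agent, including intact agent 2, drifts away
```

Without the attack, the agents agree on the initial average. With the attack, agent 2 is never attacked directly, yet it is dragged away from that value because it is reachable from agent 0, and the drift grows with every step.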
