Hassam Ullah Sheikh

5 papers

57 citations

Novelty: 38%

AI Score: 20

Ranked #193,264 of 205,806 authors (top 94%); #194 in MA (top 93%)

5 Papers

LG · Jun 24, 2020

Preventing Value Function Collapse in Ensemble Q-Learning by Maximizing Representation Diversity

Hassam Ullah Sheikh, Ladislau Bölöni

The classic DQN algorithm is limited by the overestimation bias of the learned Q-function. Subsequent algorithms have proposed techniques to reduce this problem without fully eliminating it. Recently, the Maxmin and Ensemble Q-learning algorithms have used the different estimates provided by ensembles of learners to reduce the overestimation bias. Unfortunately, these learners can converge to the same point in the parametric or representation space, falling back to the classic single neural network DQN. In this paper, we describe a regularization technique to maximize ensemble diversity in these algorithms. We propose and compare five regularization functions inspired by economics theory and consensus optimization. We show that the regularized approach significantly outperforms the Maxmin and Ensemble Q-learning algorithms as well as non-ensemble baselines.
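To make the collapse problem concrete, the following minimal sketch computes bootstrap targets in the style of Maxmin Q-learning (minimum over learners, then maximum over actions) and of an averaging ensemble. It uses plain lists of Q-values instead of neural networks, and the function names and numbers are illustrative assumptions, not code from the paper. The regularization functions proposed in the paper are not shown.

```python
from statistics import mean

def maxmin_target(reward, next_q_ensemble, gamma=0.99, done=False):
    """Maxmin-style target: per-action minimum over learners, then greedy maximum.

    next_q_ensemble is a list with one entry per learner; each entry is a list
    of Q-values for the next state, one value per action.
    """
    if done:
        return reward
    n_actions = len(next_q_ensemble[0])
    q_min = [min(q[a] for q in next_q_ensemble) for a in range(n_actions)]
    return reward + gamma * max(q_min)

def averaged_target(reward, next_q_ensemble, gamma=0.99, done=False):
    """Averaging-ensemble target: per-action mean over learners, then greedy maximum."""
    if done:
        return reward
    n_actions = len(next_q_ensemble[0])
    q_avg = [mean(q[a] for q in next_q_ensemble) for a in range(n_actions)]
    return reward + gamma * max(q_avg)

# Two learners that disagree: the ensemble changes the target.
diverse = [[1.0, 2.0], [3.0, 0.5]]
# Two learners that have collapsed to the same estimates (illustrative values).
collapsed = [[1.0, 2.0], [1.0, 2.0]]

diverse_maxmin = maxmin_target(0.0, diverse, gamma=0.5)
diverse_avg = averaged_target(0.0, diverse, gamma=0.5)
collapsed_maxmin = maxmin_target(0.0, collapsed, gamma=0.5)
collapsed_avg = averaged_target(0.0, collapsed, gamma=0.5)
```

When every learner returns the same estimates, both targets equal the single-network target, which is the situation the abstract describes as falling back to classic DQN.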

CR · Jun 4, 2020

Automatic Feature Extraction, Categorization and Detection of Malicious Code in Android Applications

Muhammad Zuhair Qadir, Atif Nisar Jilani, Hassam Ullah Sheikh

Android has recently become a popular software platform for mobile devices, which now offer almost the same functionality as personal computers, and malware has become a major concern. As the number of new Android applications is expected to increase rapidly in the near future, there is a need for fast and efficient automatic malware detection. In this paper, we define a simple static analysis approach that first extracts the features of an Android application based on its intents and categorizes the application into a known major category. It then maps that category to the permissions requested by the application and compares them with the most obvious intents of the category. As a result, we learn which apps use features that they are not supposed to use or do not need.
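The pipeline described in this abstract (categorize by intents, then compare requested permissions with what the category would need) can be sketched as a set comparison. The category profiles below are hypothetical placeholders chosen for illustration, and the overlap-based categorization rule is an assumption; the paper's actual feature extraction and category definitions are not given in the abstract.

```python
# Hypothetical category profiles: the intents that suggest a category and the
# permissions an app in that category would plausibly need.
CATEGORY_PROFILES = {
    "camera": {
        "intents": {"android.media.action.IMAGE_CAPTURE"},
        "permissions": {"android.permission.CAMERA"},
    },
    "messaging": {
        "intents": {"android.intent.action.SEND", "android.intent.action.SENDTO"},
        "permissions": {
            "android.permission.SEND_SMS",
            "android.permission.READ_CONTACTS",
        },
    },
}

def categorize(app_intents):
    """Pick the category whose intent set overlaps most with the app's intents."""
    intents = set(app_intents)
    return max(
        CATEGORY_PROFILES,
        key=lambda name: len(CATEGORY_PROFILES[name]["intents"] & intents),
    )

def unexpected_permissions(app_intents, requested_permissions):
    """Return the inferred category and the permissions it does not explain."""
    category = categorize(app_intents)
    expected = CATEGORY_PROFILES[category]["permissions"]
    return category, sorted(set(requested_permissions) - expected)

# An app that looks like a camera app but also asks to send SMS messages.
category, extra = unexpected_permissions(
    ["android.media.action.IMAGE_CAPTURE"],
    ["android.permission.CAMERA", "android.permission.SEND_SMS"],
)
```

Permissions left over after subtracting the category's expected set are the candidates for features the app is not supposed to use or does not need.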

MA · Mar 24, 2020

Multi-Agent Reinforcement Learning for Problems with Combined Individual and Team Reward

Hassam Ullah Sheikh, Ladislau Bölöni

Many cooperative multi-agent problems require agents to learn individual tasks while contributing to the collective success of the group. This is a challenging task for current state-of-the-art multi-agent reinforcement learning algorithms, which are designed to maximize either the global reward of the team or the individual local rewards. The problem is exacerbated when either of the rewards is sparse, leading to unstable learning. To address this problem, we present Decomposed Multi-Agent Deep Deterministic Policy Gradient (DE-MADDPG): a novel cooperative multi-agent reinforcement learning framework that simultaneously learns to maximize the global and local rewards. We evaluate our solution on the challenging defensive escort team problem and show that it achieves significantly better and more stable performance than the direct adaptation of the MADDPG algorithm.
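One way to read "simultaneously learns to maximize the global and local rewards" is to keep two separate value estimates, one trained on the team reward and one on an agent's own reward, and let the policy improve against both. The scalar sketch below illustrates that idea only. The function names, the squared-error losses, and the additive actor objective are assumptions made for illustration and are not taken from the paper.

```python
def td_target(reward, next_q, gamma=0.95, done=False):
    """One-step temporal-difference target for a single critic."""
    return reward if done else reward + gamma * next_q

def critic_losses(team_reward, local_reward, q_global, q_local,
                  next_q_global, next_q_local, gamma=0.95):
    """Squared TD errors for a global (team) critic and a local (per-agent) critic."""
    global_loss = (td_target(team_reward, next_q_global, gamma) - q_global) ** 2
    local_loss = (td_target(local_reward, next_q_local, gamma) - q_local) ** 2
    return global_loss, local_loss

def actor_objective(q_global, q_local):
    """Assumed actor objective: the sum of both critics' value estimates."""
    return q_global + q_local

# A step with no team reward (sparse) but a non-zero individual reward.
sparse_step_losses = critic_losses(
    team_reward=0.0, local_reward=1.0,
    q_global=0.0, q_local=0.0,
    next_q_global=0.0, next_q_local=0.0,
)
```

In this example the global critic has zero error because the team reward is absent, while the local critic still produces a learning signal. Keeping the two rewards in separate critics is what lets one of them stay informative when the other is sparse.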

MA · Aug 24, 2019

Universal Policies to Learn Them All

Hassam Ullah Sheikh, Ladislau Bölöni

We explore a collaborative and cooperative multi-agent reinforcement learning setting where a team of reinforcement learning agents attempts to solve a single cooperative task in a multi-scenario setting. We propose a novel multi-agent reinforcement learning algorithm inspired by universal value function approximators that generalizes not only over the state space but also over a set of different scenarios. Additionally, to support our claim, we introduce a challenging 2D multi-agent urban security environment where the learning agents try to protect a person from nearby bystanders in a variety of scenarios. Our study shows that state-of-the-art multi-agent reinforcement learning algorithms fail to generalize a single task over multiple scenarios, while our proposed solution performs as well as scenario-dependent policies.
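Universal value function approximators condition the value estimate on a goal in addition to the state. A minimal sketch of the analogous idea for scenarios is to append a scenario descriptor to each observation, so that one set of parameters can serve every scenario. The one-hot encoding and the function name below are assumptions for illustration; the abstract does not say how the paper represents scenarios.

```python
def scenario_conditioned_input(observation, scenario_id, n_scenarios):
    """Append a one-hot scenario descriptor to an agent's observation vector."""
    if not 0 <= scenario_id < n_scenarios:
        raise ValueError("scenario_id must be in [0, n_scenarios)")
    one_hot = [1.0 if i == scenario_id else 0.0 for i in range(n_scenarios)]
    return list(observation) + one_hot

# The same observation paired with two different scenarios yields two
# different inputs for a single shared policy or value network.
obs = [0.2, -1.0]
input_scenario_0 = scenario_conditioned_input(obs, 0, 3)
input_scenario_1 = scenario_conditioned_input(obs, 1, 3)
```

A network fed these inputs can learn scenario-specific behavior without a separate policy per scenario.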

MA · Jan 28, 2019

Designing a Multi-Objective Reward Function for Creating Teams of Robotic Bodyguards Using Deep Reinforcement Learning

Hassam Ullah Sheikh, Ladislau Bölöni

We consider a scenario where a team of bodyguard robots provides physical protection to a VIP in a crowded public space. We use deep reinforcement learning to learn the policy to be followed by the robots. As the robot bodyguards need to follow several difficult-to-reconcile goals, we study several primitive and composite reward functions and their impact on the overall behavior of the robotic bodyguards.
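As an illustration of what a composite reward built from primitive ones might look like, the sketch below combines two hypothetical primitives: one that keeps a bodyguard near a preferred distance from the VIP, and one that penalizes bystanders who come close to the VIP. The primitives, weights, and distance thresholds are all assumptions for illustration and are not the reward functions studied in the paper.

```python
import math

def distance_reward(bodyguard, vip, preferred=1.0):
    """Hypothetical primitive: penalize deviation from a preferred distance to the VIP."""
    return -abs(math.dist(bodyguard, vip) - preferred)

def threat_reward(vip, bystanders, safe_distance=2.0):
    """Hypothetical primitive: penalize each bystander inside the VIP's safe radius."""
    return -sum(max(0.0, safe_distance - math.dist(vip, b)) for b in bystanders)

def composite_reward(bodyguard, vip, bystanders, w_dist=0.5, w_threat=0.5):
    """Weighted sum of the two primitives; the weights trade off the competing goals."""
    return (w_dist * distance_reward(bodyguard, vip)
            + w_threat * threat_reward(vip, bystanders))

# A bodyguard at the preferred distance, with one bystander 1.0 away from the VIP.
example_reward = composite_reward(
    bodyguard=(1.0, 0.0), vip=(0.0, 0.0), bystanders=[(0.0, 1.0)]
)
```

Changing the weights shifts the behavior between staying in formation and reacting to nearby bystanders, which is the kind of trade-off between difficult-to-reconcile goals that the abstract refers to.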