Ammar Fayad

LG
h-index2
5papers
10citations
Novelty46%
AI Score36

5 Papers

LGDec 14, 2022
Hierarchical Strategies for Cooperative Multi-Agent Reinforcement Learning

Majd Ibrahim, Ammar Fayad

Adequate strategizing of agents behaviors is essential to solving cooperative MARL problems. One intuitively beneficial yet uncommon method in this domain is predicting agents future behaviors and planning accordingly. Leveraging this point, we propose a two-level hierarchical architecture that combines a novel information-theoretic objective with a trajectory prediction model to learn a strategy. To this end, we introduce a latent policy that learns two types of latent strategies: individual $z_A$, and relational $z_R$ using a modified Graph Attention Network module to extract interaction features. We encourage each agent to behave according to the strategy by conditioning its local $Q$ functions on $z_A$, and we further equip agents with a shared $Q$ function that conditions on $z_R$. Additionally, we introduce two regularizers to allow predicted trajectories to be accurate and rewarding. Empirical results on Google Research Football (GRF) and StarCraft (SC) II micromanagement tasks show that our method establishes a new state of the art being, to the best of our knowledge, the first MARL algorithm to solve all super hard SC II scenarios as well as the GRF full game with a win rate higher than $95\%$, thus outperforming all existing methods. Videos and brief overview of the methods and results are available at: https://sites.google.com/view/hier-strats-marl/home.

QUANT-PHMar 6
Quantum Diffusion Models: Score Reversal Is Not Free in Gaussian Dynamics

Ammar Fayad

Diffusion-based generative modeling suggests reversing a noising semigroup by adding a score drift. For continuous-variable Gaussian Markov dynamics, complete positivity couples drift and diffusion at the generator level. For a quantum-limited attenuator with thermal parameter $ν$ and squeezing $r$, the fixed-diffusion Wigner-score (Bayes) reverse drift violates CP iff $\cosh(2r)>ν$. Any Gaussian CP repair must inject extra diffusion, implying $-2\ln F\ge c_{\text{geom}}(ν_{\min})I_{\mathrm{dec}}^{\mathrm{wc}}$.

GR-QCNov 29, 2024
Unsupervised Learning Approach to Anomaly Detection in Gravitational Wave Data

Ammar Fayad

Gravitational waves (GW), predicted by Einstein's General Theory of Relativity, provide a powerful probe of astrophysical phenomena and fundamental physics. In this work, we propose an unsupervised anomaly detection method using variational autoencoders (VAEs) to analyze GW time-series data. By training on noise-only data, the VAE accurately reconstructs noise inputs while failing to reconstruct anomalies, such as GW signals, which results in measurable spikes in the reconstruction error. The method was applied to data from the LIGO H1 and L1 detectors. Evaluation on testing datasets containing both noise and GW events demonstrated reliable detection, achieving an area under the ROC curve (AUC) of 0.89. This study introduces VAEs as a robust, unsupervised approach for identifying anomalies in GW data, which offers a scalable framework for detecting known and potentially new phenomena in physics.

LGAug 28, 2021
Influence-Based Reinforcement Learning for Intrinsically-Motivated Agents

Ammar Fayad, Majd Ibrahim

Discovering successful coordinated behaviors is a central challenge in Multi-Agent Reinforcement Learning (MARL) since it requires exploring a joint action space that grows exponentially with the number of agents. In this paper, we propose a mechanism for achieving sufficient exploration and coordination in a team of agents. Specifically, agents are rewarded for contributing to a more diversified team behavior by employing proper intrinsic motivation functions. To learn meaningful coordination protocols, we structure agents' interactions by introducing a novel framework, where at each timestep, an agent simulates counterfactual rollouts of its policy and, through a sequence of computations, assesses the gap between other agents' current behaviors and their targets. Actions that minimize the gap are considered highly influential and are rewarded. We evaluate our approach on a set of challenging tasks with sparse rewards and partial observability that require learning complex cooperative strategies under a proper exploration scheme, such as the StarCraft Multi-Agent Challenge. Our methods show significantly improved performances over different baselines across all tasks.

LGApr 9, 2021
Behavior-Guided Actor-Critic: Improving Exploration via Learning Policy Behavior Representation for Deep Reinforcement Learning

Ammar Fayad, Majd Ibrahim

In this work, we propose Behavior-Guided Actor-Critic (BAC), an off-policy actor-critic deep RL algorithm. BAC mathematically formulates the behavior of the policy through autoencoders by providing an accurate estimation of how frequently each state-action pair was visited while taking into consideration state dynamics that play a crucial role in determining the trajectories produced by the policy. The agent is encouraged to change its behavior consistently towards less-visited state-action pairs while attaining good performance by maximizing the expected discounted sum of rewards, resulting in an efficient exploration of the environment and good exploitation of all high reward regions. One prominent aspect of our approach is that it is applicable to both stochastic and deterministic actors in contrast to maximum entropy deep reinforcement learning algorithms. Results show considerably better performances of BAC when compared to several cutting-edge learning algorithms.