MLLGMASPSep 14, 2023

Deep Multi-Agent Reinforcement Learning for Decentralized Active Hypothesis Testing

arXiv:2309.08477v19 citationsh-index: 23Has Code
Originality Incremental advance
AI Analysis

This work addresses a complex multi-agent decision-making problem in hypothesis testing, offering a novel solution that could benefit fields like distributed sensing or robotics, though it is incremental as it extends deep learning methods to a multi-agent setting.

The paper tackles the decentralized active hypothesis testing problem with multiple agents by introducing the MARLA algorithm, which uses deep multi-agent reinforcement learning to minimize Bayes risk, and demonstrates improved performance over single-agent approaches in experiments.

We consider a decentralized formulation of the active hypothesis testing (AHT) problem, where multiple agents gather noisy observations from the environment with the purpose of identifying the correct hypothesis. At each time step, agents have the option to select a sampling action. These different actions result in observations drawn from various distributions, each associated with a specific hypothesis. The agents collaborate to accomplish the task, where message exchanges between agents are allowed over a rate-limited communications channel. The objective is to devise a multi-agent policy that minimizes the Bayes risk. This risk comprises both the cost of sampling and the joint terminal cost incurred by the agents upon making a hypothesis declaration. Deriving optimal structured policies for AHT problems is generally mathematically intractable, even in the context of a single agent. As a result, recent efforts have turned to deep learning methodologies to address these problems, which have exhibited significant success in single-agent learning scenarios. In this paper, we tackle the multi-agent AHT formulation by introducing a novel algorithm rooted in the framework of deep multi-agent reinforcement learning. This algorithm, named Multi-Agent Reinforcement Learning for AHT (MARLA), operates at each time step by having each agent map its state to an action (sampling rule or stopping rule) using a trained deep neural network with the goal of minimizing the Bayes risk. We present a comprehensive set of experimental results that effectively showcase the agents' ability to learn collaborative strategies and enhance performance using MARLA. Furthermore, we demonstrate the superiority of MARLA over single-agent learning approaches. Finally, we provide an open-source implementation of the MARLA framework, for the benefit of researchers and developers in related domains.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes