LG AI MA

May 27, 2022

FedFormer: Contextual Federation with Attention in Reinforcement Learning

Liam Hebert, Lukasz Golab, Pascal Poupart, Robin Cohen

arXiv:2205.13697v3 10.413 citations h-index: 48 Has Code

Originality: Incremental advance

AI Analysis

This paper addresses multi-agent federated reinforcement learning, aiming to improve efficiency and effectiveness. It appears incremental: the aggregation method is novel, but it builds on existing federation strategies.

The paper tackles the problem of aggregating insights from multiple agents in federated reinforcement learning by proposing FedFormer, which uses Transformer attention to contextually aggregate embeddings instead of averaging weights. The approach achieves higher episodic returns than FedAvg and non-federated Soft Actor-Critic methods in Meta-World evaluations while maintaining privacy constraints.

A core issue in multi-agent federated reinforcement learning is defining how to aggregate insights from multiple agents. This is commonly done by taking the average of each participating agent's model weights into one common model (FedAvg). We instead propose FedFormer, a novel federation strategy that utilizes Transformer Attention to contextually aggregate embeddings from models originating from different learner agents. In so doing, we attentively weigh the contributions of other agents with respect to the current agent's environment and learned relationships, thus providing a more effective and efficient federation. We evaluate our methods on the Meta-World environment and find that our approach yields significant improvements over FedAvg and non-federated Soft Actor-Critic single-agent methods. Our results compared to Soft Actor-Critic show that FedFormer achieves higher episodic return while still abiding by the privacy constraints of federated learning. Finally, we also demonstrate improvements in effectiveness with increased agent pools across all methods in certain tasks. This is contrasted by FedAvg, which fails to make noticeable improvements when scaled.
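The contrast between the two aggregation styles described in the abstract can be illustrated with a minimal NumPy sketch. It is not the FedFormer architecture: the function names, the use of the local agent's embedding as the query, and the absence of learned projection matrices are all assumptions made for illustration. `fedavg` averages model weights across agents, while `attention_aggregate` applies plain scaled dot-product attention over per-agent embeddings.

```python
import numpy as np

def fedavg(agent_weights):
    """FedAvg-style aggregation: element-wise mean of each parameter across agents.

    agent_weights is a list of dicts mapping parameter names to arrays
    (a hypothetical representation chosen for this sketch).
    """
    keys = agent_weights[0].keys()
    return {k: np.mean([w[k] for w in agent_weights], axis=0) for k in keys}

def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - np.max(x))
    return e / e.sum()

def attention_aggregate(local_embedding, peer_embeddings):
    """Scaled dot-product attention over agent embeddings.

    Assumption: the local agent's embedding is the query, and the stacked
    local and peer embeddings serve as both keys and values. A real
    Transformer layer would add learned projections and multiple heads.
    """
    candidates = np.vstack([local_embedding, peer_embeddings])  # (n_agents, d)
    d = local_embedding.shape[-1]
    scores = candidates @ local_embedding / np.sqrt(d)          # (n_agents,)
    weights = softmax(scores)                                   # sums to 1
    return weights @ candidates, weights                        # (d,), (n_agents,)

# Example with random embeddings: one local agent and three peers.
rng = np.random.default_rng(0)
local = rng.normal(size=8)
peers = rng.normal(size=(3, 8))
mixed, weights = attention_aggregate(local, peers)
```

In this sketch the attention weights depend on the local agent's current embedding, so each agent can weigh its peers differently per input, whereas `fedavg` gives every agent the same fixed, uniform mixture.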
