Model based Multi-agent Reinforcement Learning with Tensor Decompositions
This is an incremental contribution aimed at multi-agent reinforcement learning researchers, focused on improving generalization over large state-action spaces.
The paper tackles the challenge of generalizing over intractable state-action spaces in multi-agent reinforcement learning by modeling transition and reward functions as low-rank tensors, with initial experiments on synthetic MDPs showing faster convergence when the true functions are low-rank.
A challenge in multi-agent reinforcement learning is generalizing over intractable state-action spaces. Inspired by Tesseract [Mahajan et al., 2021], this position paper investigates generalization to unexplored state-action pairs by modeling the transition and reward functions as tensors of low CP-rank. Initial experiments on synthetic MDPs show that using tensor decompositions in a model-based reinforcement learning algorithm can lead to much faster convergence if the true transition and reward functions are indeed of low rank.
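
To make the low CP-rank assumption concrete, the following is a minimal NumPy sketch, not the paper's algorithm. It builds a synthetic two-agent reward tensor indexed by (state, action of agent 1, action of agent 2) as a sum of a few rank-one terms and compares its parameter count with that of the dense tensor. The sizes, the chosen rank, and the helper name `cp_reconstruct` are illustrative assumptions; a transition tensor would carry an additional next-state mode and would need to be normalized into valid probabilities.

```python
import numpy as np


def cp_reconstruct(factors):
    """Rebuild a 3-way tensor from CP factors (hypothetical helper).

    Each factor matrix has shape (mode_size, rank). Entry (i, j, k) of the
    result is sum over r of A[i, r] * B[j, r] * C[k, r].
    """
    A, B, C = factors
    return np.einsum("ir,jr,kr->ijk", A, B, C)


rng = np.random.default_rng(0)

# Assumed toy sizes: 20 states, two agents with 5 actions each, CP-rank 3.
n_states, n_actions_1, n_actions_2, rank = 20, 5, 5, 3

# One factor matrix per tensor mode (state, agent-1 action, agent-2 action).
factors = [
    rng.normal(size=(n, rank))
    for n in (n_states, n_actions_1, n_actions_2)
]
reward = cp_reconstruct(factors)

# The dense tensor needs one parameter per (state, joint action) entry,
# while the CP form needs only rank * (sum of mode sizes) parameters.
dense_params = reward.size
cp_params = sum(f.size for f in factors)

# The state-mode unfolding of a CP-rank-r tensor has matrix rank at most r.
unfolding = reward.reshape(n_states, -1)
unfolding_rank = np.linalg.matrix_rank(unfolding)

print(f"dense parameters: {dense_params}, CP parameters: {cp_params}")
print(f"rank of state-mode unfolding: {unfolding_rank}")
```

Because every entry of the tensor is tied to the same small set of factor vectors, an estimate of the factors from visited state-action pairs also yields predictions for unvisited ones, which is the kind of generalization the abstract refers to. The gap between the two parameter counts widens as more agents (and hence more action modes) are added, since the dense size grows multiplicatively while the CP size grows additively.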