Deep Q-Network Based Multi-agent Reinforcement Learning with Binary Action Agents
This work addresses training efficiency and performance issues in multi-agent reinforcement learning and is aimed at researchers and practitioners, though it appears incremental because it builds on existing DQN methods.
The paper tackles the complexity and training difficulties of Deep Q-Network (DQN) based multi-agent reinforcement learning systems. It proposes a simpler approach in which agents share state and rewards but keep agent-specific actions, and reports faster convergence and better performance than the baselines on CartPole-v1, LunarLander-v2, and Maze Traversal.
Deep Q-Network (DQN) based multi-agent systems (MAS) for reinforcement learning (RL) use various schemes in which the agents have to learn and communicate. The learning, however, is specific to each agent, and the communication may be designed to suit the agents. As more complex Deep Q-Networks come to the fore, the overall complexity of the multi-agent system increases, leading to issues such as difficulty in training, the need for more resources and training time, and difficulty in fine-tuning. To address these issues, we propose a simple but efficient DQN based MAS for RL in which each agent is a DQN and the experience replay pool of each DQN is updated with shared state and rewards but agent-specific actions. The benefits of the approach are overall simplicity, faster convergence, and better performance compared to conventional DQN based approaches. The method can be extended to any DQN. To illustrate this, we use a simple DQN and a DDQN (DQN with Double Q-learning) on three separate tasks: CartPole-v1 (OpenAI Gym environment), LunarLander-v2 (OpenAI Gym environment), and Maze Traversal (a customized environment). The proposed approach outperforms the baseline by a decent margin on each of these tasks.
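The replay update described above can be illustrated with a minimal sketch. It is not the authors' implementation. The names `Transition`, `AgentReplay`, and `store_joint_step` are hypothetical, and giving each agent its own pool is an assumption. How the agents' individual actions are combined into the action sent to the environment is not specified here, so the sketch leaves that step out and shows only the storage step.

```python
import random
from collections import deque, namedtuple

# One stored experience. The field names are illustrative assumptions.
Transition = namedtuple("Transition", "state action reward next_state done")


class AgentReplay:
    """Experience replay pool for a single agent (hypothetical helper)."""

    def __init__(self, capacity=10_000):
        # Oldest transitions are discarded once capacity is reached.
        self.buffer = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, done):
        self.buffer.append(Transition(state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform sampling without replacement, capped at the pool size.
        pool = list(self.buffer)
        return random.sample(pool, min(batch_size, len(pool)))


def store_joint_step(replays, state, actions, reward, next_state, done):
    """Record one environment step in every agent's pool.

    state, reward, next_state and done are shared by all agents;
    only the action differs from agent to agent.
    """
    if len(replays) != len(actions):
        raise ValueError("expected exactly one action per agent")
    for replay, action in zip(replays, actions):
        replay.push(state, action, reward, next_state, done)


# Usage: two agents observe the same step but each took its own action.
replays = [AgentReplay() for _ in range(2)]
store_joint_step(
    replays,
    state=(0.0, 0.1),
    actions=[0, 1],
    reward=1.0,
    next_state=(0.0, 0.2),
    done=False,
)
```

Under these assumptions, each agent's DQN would then sample from its own pool and train as usual, so the only coupling between agents is the shared state and reward signal.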