A Versatile Multi-Agent Reinforcement Learning Benchmark for Inventory Management
This work provides a benchmark for testing MARL algorithms in inventory management and addresses domain-specific challenges. Its contribution is incremental, however: it builds on existing MARL and operations research (OR) methods without introducing new algorithms.
The authors tackled the challenge of applying multi-agent reinforcement learning (MARL) to real-world scenarios such as inventory management by developing MABIM, a versatile benchmark simulator that generates tasks featuring large scale, complex agent interactions, and non-stationary dynamics. They then evaluated classic operations research (OR) methods and popular MARL algorithms on these tasks to identify their weaknesses and potential.
Multi-agent reinforcement learning (MARL) models multiple agents that interact and learn within a shared environment. This paradigm is applicable to various industrial scenarios such as autonomous driving, quantitative trading, and inventory management. However, applying MARL to these real-world scenarios is impeded by challenges such as scaling up, complex agent interactions, and non-stationary dynamics. To encourage MARL research on these challenges, we develop MABIM (Multi-Agent Benchmark for Inventory Management), a multi-echelon, multi-commodity inventory management simulator that can generate versatile tasks with these challenging properties. Based on MABIM, we evaluate the performance of classic operations research (OR) methods and popular MARL algorithms on these tasks to highlight their weaknesses and potential.
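To make "multi-echelon, multi-commodity" concrete, the following is a minimal toy sketch of one simulation period in a chain of warehouses, paired with a base-stock (order-up-to) rule as an example of a classic OR heuristic. It is not MABIM's API: the names `Echelon`, `base_stock_orders`, and `step` are hypothetical, and the model assumes zero lead time, lost sales, and an external supplier with unlimited stock.

```python
from dataclasses import dataclass


@dataclass
class Echelon:
    """One warehouse level holding stock for several commodities (SKUs). Hypothetical type."""
    stock: list[int]       # on-hand units per SKU
    base_stock: list[int]  # order-up-to target per SKU


def base_stock_orders(e: Echelon) -> list[int]:
    """Base-stock heuristic: order enough to bring each SKU back up to its target."""
    return [max(0, target - on_hand) for target, on_hand in zip(e.base_stock, e.stock)]


def step(chain: list[Echelon], demand: list[int]) -> list[int]:
    """Advance the chain by one period and return units sold per SKU.

    chain[0] is the most upstream echelon; chain[-1] faces customer demand.
    Assumptions (not from the paper): zero lead time, unmet demand is lost.
    """
    # 1. Customer demand is served from the last echelon, limited by its stock.
    last = chain[-1]
    sold = [min(d, s) for d, s in zip(demand, last.stock)]
    last.stock = [s - q for s, q in zip(last.stock, sold)]

    # 2. Each echelon replenishes from the one above it; shipments are
    #    capped by what the upstream echelon actually has on hand.
    for i in range(len(chain) - 1, 0, -1):
        orders = base_stock_orders(chain[i])
        upstream = chain[i - 1]
        shipped = [min(o, u) for o, u in zip(orders, upstream.stock)]
        upstream.stock = [u - q for u, q in zip(upstream.stock, shipped)]
        chain[i].stock = [s + q for s, q in zip(chain[i].stock, shipped)]

    # 3. The top echelon is refilled by an (assumed unlimited) external supplier.
    top = chain[0]
    top.stock = [s + o for s, o in zip(top.stock, base_stock_orders(top))]
    return sold
```

For instance, calling `step` on a two-echelon chain with two SKUs and demand `[3, 8]` serves what the store has on hand and then refills both levels toward their targets. A benchmark of the kind described above would add the properties this sketch omits, such as lead times, shared capacity that couples the SKUs, and demand that shifts over time.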