MARLIM: Multi-Agent Reinforcement Learning for Inventory Management
This work addresses inventory optimization in supply chain industries, but the contribution is incremental: it applies known RL methods to a specific domain.
The paper tackles the inventory management problem for a multi-product supply chain with stochastic demands and lead-times by introducing MARLIM, a multi-agent reinforcement learning framework, and shows that it outperforms traditional baselines in numerical experiments on real data.
Maintaining a balance between the supply and demand of products by optimizing replenishment decisions is one of the most important challenges in the supply chain industry. This paper presents a novel reinforcement learning framework, called MARLIM, to address the inventory management problem for a single-echelon multi-product supply chain with stochastic demands and lead-times. Within this context, controllers are developed through single or multiple agents in a cooperative setting. Numerical experiments on real data demonstrate the benefits of reinforcement learning methods over traditional baselines.
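To make the problem setting concrete, the following is a minimal sketch of a single-echelon, multi-product inventory simulation with stochastic demands and lead-times, together with a classical order-up-to (base-stock) rule of the kind a learned controller might be compared against. It is not the MARLIM environment or any baseline from the paper: the class ToyInventoryEnv, the uniform demand and lead-time distributions, the lost-sales assumption, and the cost parameters are all hypothetical choices made for illustration.

```python
import random
from dataclasses import dataclass, field


@dataclass
class ToyInventoryEnv:
    """Hypothetical single-echelon, multi-product inventory simulator.

    Illustrative only: distributions, costs, and names are assumptions,
    not taken from the MARLIM paper.
    """
    n_products: int = 2
    max_demand: int = 5          # demand ~ Uniform{0..max_demand} (assumption)
    max_lead_time: int = 3       # lead time ~ Uniform{1..max_lead_time} (assumption)
    holding_cost: float = 1.0    # per unit on hand at the end of a period
    stockout_cost: float = 5.0   # per unit of unmet demand (lost sales assumed)
    seed: int = 0
    on_hand: list = field(init=False)
    pipeline: list = field(init=False)  # per product: list of [periods_left, qty]

    def __post_init__(self):
        self.rng = random.Random(self.seed)
        self.on_hand = [0] * self.n_products
        self.pipeline = [[] for _ in range(self.n_products)]

    def inventory_position(self, i):
        """On-hand stock plus everything ordered but not yet received."""
        return self.on_hand[i] + sum(q for _, q in self.pipeline[i])

    def step(self, orders):
        """Advance one period given one order quantity per product.

        Returns the total holding plus stockout cost for the period.
        """
        cost = 0.0
        for i, qty in enumerate(orders):
            # Age outstanding orders and receive the ones that are due.
            for order in self.pipeline[i]:
                order[0] -= 1
            self.on_hand[i] += sum(q for t, q in self.pipeline[i] if t <= 0)
            self.pipeline[i] = [o for o in self.pipeline[i] if o[0] > 0]
            # Place the new order with a randomly drawn lead time.
            if qty > 0:
                lead = self.rng.randint(1, self.max_lead_time)
                self.pipeline[i].append([lead, qty])
            # Serve stochastic demand from on-hand stock.
            demand = self.rng.randint(0, self.max_demand)
            sold = min(demand, self.on_hand[i])
            self.on_hand[i] -= sold
            cost += self.stockout_cost * (demand - sold)
            cost += self.holding_cost * self.on_hand[i]
        return cost


def base_stock_policy(env, target):
    """Order up to `target` units of inventory position for each product."""
    return [max(0, target - env.inventory_position(i))
            for i in range(env.n_products)]


def run_episode(target, steps=50, seed=0):
    """Total cost of following the base-stock rule for `steps` periods."""
    env = ToyInventoryEnv(seed=seed)
    return sum(env.step(base_stock_policy(env, target)) for _ in range(steps))
```

In a sketch like this, a reinforcement learning agent would take the place of base_stock_policy, choosing the order quantities from the observed stock and pipeline, with the negative of the cost returned by step serving as its reward. A multi-agent variant could assign one such decision-maker to each product while sharing the total cost, which is one way to read the cooperative setting described above.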