Interpretable Reinforcement Learning via Neural Additive Models for Inventory Management
This work addresses supply chain managers' need for inventory management that is both interpretable and flexible, though the contribution is incremental because it builds on existing interpretability and reinforcement learning techniques.
The paper tackles the problem of developing dynamic inventory ordering policies for multi-echelon supply chains, which are crucial for reacting to changes like those seen during the COVID-19 pandemic. It proposes an interpretable reinforcement learning approach using Neural Additive Models that is competitive with standard methods.
The COVID-19 pandemic has highlighted the importance of supply chains and the role of digital management in reacting to dynamic changes in the environment. In this work, we focus on developing dynamic inventory ordering policies for a multi-echelon, i.e., multi-stage, supply chain. Traditional inventory optimization methods aim to determine a static reordering policy. As a result, these policies cannot adjust to dynamic changes such as those observed during the COVID-19 crisis. On the other hand, conventional strategies offer the advantage of being interpretable, which is a crucial feature for supply chain managers who need to communicate decisions to their stakeholders. To combine both properties, we propose an interpretable reinforcement learning approach that aims to be as interpretable as the traditional static policies while being as flexible and environment-agnostic as other deep learning-based reinforcement learning solutions. We propose to use Neural Additive Models as the interpretable dynamic policy of a reinforcement learning agent, and show that this approach is competitive with a standard fully connected policy. Finally, we use the interpretability property to gain insights into a complex ordering strategy for a simple, linear three-echelon inventory supply chain.
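To make the idea of an additive policy concrete, the following is a minimal sketch and not the authors' implementation. It assumes a discrete action space, one single-hidden-layer network per state feature, and a softmax over summed per-feature logits. The class name `NAMPolicy`, the layer sizes, and the example state values are hypothetical, and training (for example with a policy-gradient method) is omitted.

```python
import numpy as np


class NAMPolicy:
    """Additive policy sketch: logits(s) = bias + sum_i f_i(s_i).

    Each f_i is a small network that sees only state feature i, so its
    output can be inspected as that feature's contribution to each action.
    All sizes and initialisations here are illustrative assumptions.
    """

    def __init__(self, n_features, n_actions, hidden=16, seed=0):
        rng = np.random.default_rng(seed)
        # One independent single-hidden-layer network per input feature.
        self.w1 = rng.normal(0.0, 1.0, size=(n_features, hidden))
        self.b1 = np.zeros((n_features, hidden))
        self.w2 = rng.normal(
            0.0, 1.0 / np.sqrt(hidden), size=(n_features, hidden, n_actions)
        )
        self.bias = np.zeros(n_actions)

    def contributions(self, state):
        """Return an (n_features, n_actions) array; row i depends only on state[i]."""
        state = np.asarray(state, dtype=float)
        hidden = np.tanh(state[:, None] * self.w1 + self.b1)  # (n_features, hidden)
        return np.einsum("fh,fha->fa", hidden, self.w2)        # (n_features, n_actions)

    def action_probs(self, state):
        """Sum the per-feature contributions and apply a softmax over actions."""
        logits = self.contributions(state).sum(axis=0) + self.bias
        z = np.exp(logits - logits.max())  # subtract max for numerical stability
        return z / z.sum()


# Hypothetical usage: three normalised inventory levels, four order quantities.
policy = NAMPolicy(n_features=3, n_actions=4)
probs = policy.action_probs([0.5, 0.2, 0.0])
per_feature = policy.contributions([0.5, 0.2, 0.0])
```

Because the logits are a plain sum, plotting each row of `contributions` against its own input shows how a single state feature pushes the agent toward or away from each order quantity, which is the kind of inspection a fully connected policy does not offer directly.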