LG, AI, CE | Jan 14, 2025

Data-driven inventory management for new products: An adjusted Dyna-$Q$ approach with transfer learning

arXiv:2501.08109v4
2025 IEEE 21st International Conference on Automation Science and Engineering (CASE)
Originality: Incremental advance
AI Analysis

This paper addresses inventory management challenges for businesses launching new products and offers a practical solution with measurable improvements. The advance is incremental, since it builds on existing Dyna-Q and transfer learning methods.

The paper tackles inventory management for new products without historical demand data by proposing an adjusted Dyna-Q reinforcement learning algorithm with transfer learning, achieving up to a 23.7% reduction in average daily cost compared to Q-learning and up to a 77.5% reduction in training time compared to classic Dyna-Q.

In this paper, we propose a novel reinforcement learning algorithm for inventory management of newly launched products with no historical demand information. The algorithm follows the classic Dyna-$Q$ structure, balancing the model-free and model-based approaches, while accelerating the training process of Dyna-$Q$ and mitigating the model discrepancy generated by the model-based feedback. Based on the idea of transfer learning, warm-start information from the demand data of existing similar products can be incorporated into the algorithm to further stabilize the early-stage training and reduce the variance of the estimated optimal policy. Our approach is validated through a case study of bakery inventory management with real data. The adjusted Dyna-$Q$ shows up to a 23.7\% reduction in average daily cost compared with $Q$-learning, and up to a 77.5\% reduction in training time within the same horizon compared with classic Dyna-$Q$. With transfer learning, the adjusted Dyna-$Q$ achieves the lowest total cost, the lowest variance in total cost, and relatively low shortage percentages among all the benchmark algorithms over a 30-day test.
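
To make the Dyna-$Q$ structure mentioned in the abstract concrete, here is a minimal sketch of the classic tabular Dyna-$Q$ loop on a toy inventory problem. It is not the paper's adjusted algorithm, whose modifications the abstract does not spell out. The cost parameters, state and action ranges, the function names `step`, `q_update`, and `dyna_q`, and the way `warm_start_demands` seeds the demand model with data from a similar product are all assumptions made for illustration.

```python
import random
from collections import defaultdict

# Hypothetical toy setting (not taken from the paper).
MAX_INV = 10                 # storage capacity
ACTIONS = list(range(6))     # possible daily order quantities
HOLD_COST, SHORT_COST = 1.0, 4.0
ALPHA, GAMMA, EPSILON = 0.1, 0.95, 0.1

def step(inv, order, demand):
    """One day: receive the order, serve demand, pay holding and shortage cost."""
    stock = min(inv + order, MAX_INV)
    left = max(stock - demand, 0)
    short = max(demand - stock, 0)
    reward = -(HOLD_COST * left + SHORT_COST * short)
    return left, reward

def q_update(Q, s, a, r, s2):
    """Standard one-step Q-learning update."""
    best_next = max(Q[(s2, b)] for b in ACTIONS)
    Q[(s, a)] += ALPHA * (r + GAMMA * best_next - Q[(s, a)])

def dyna_q(demand_sampler, days, planning_steps, warm_start_demands=(), seed=0):
    rng = random.Random(seed)
    Q = defaultdict(float)
    # Learned demand model: an empirical pool of observed demands.
    # Assumption: transfer learning is mimicked by pre-filling this pool
    # with demand observations from a similar existing product.
    demand_pool = list(warm_start_demands)
    visited, seen = [], set()
    inv = 0
    for _ in range(days):
        # Epsilon-greedy action selection.
        if rng.random() < EPSILON:
            a = rng.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda b: Q[(inv, b)])
        d = demand_sampler(rng)
        s2, r = step(inv, a, d)
        q_update(Q, inv, a, r, s2)      # model-free update from real experience
        demand_pool.append(d)           # model learning
        if (inv, a) not in seen:
            seen.add((inv, a))
            visited.append((inv, a))
        # Model-based planning: replay visited pairs against sampled demand.
        for _ in range(planning_steps):
            ps, pa = rng.choice(visited)
            ps2, pr = step(ps, pa, rng.choice(demand_pool))
            q_update(Q, ps, pa, pr, ps2)
        inv = s2
    return Q
```

A call such as `dyna_q(lambda rng: rng.randint(0, 5), days=200, planning_steps=5, warm_start_demands=[2, 3, 3, 4])` returns a Q-table keyed by `(inventory, order)` pairs. Setting `planning_steps=0` reduces the loop to plain $Q$-learning, which is the model-free baseline the abstract compares against.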

