AI Agents for Inventory Control: Human-LLM-OR Complementarity

Jackie Baek, Yaopeng Fu, Will Ma, Tianyi Peng

arXiv:2602.12631v16.01 citationsh-index: 12

Originality Incremental advance

AI Analysis

This addresses inventory management challenges for businesses by showing complementary effects between traditional and AI methods, though it is incremental in combining existing techniques.

The paper tackled inventory control by integrating operations research algorithms, large language models, and humans, finding that OR-augmented LLM methods outperform isolated approaches and human-AI teams achieve higher profits than either alone, with a substantial fraction of individuals benefiting from collaboration.

Inventory control is a fundamental operations problem in which ordering decisions are traditionally guided by theoretically grounded operations research (OR) algorithms. However, such algorithms often rely on rigid modeling assumptions and can perform poorly when demand distributions shift or relevant contextual information is unavailable. Recent advances in large language models (LLMs) have generated interest in AI agents that can reason flexibly and incorporate rich contextual signals, but it remains unclear how best to incorporate LLM-based methods into traditional decision-making pipelines. We study how OR algorithms, LLMs, and humans can interact and complement each other in a multi-period inventory control setting. We construct InventoryBench, a benchmark of over 1,000 inventory instances spanning both synthetic and real-world demand data, designed to stress-test decision rules under demand shifts, seasonality, and uncertain lead times. Through this benchmark, we find that OR-augmented LLM methods outperform either method in isolation, suggesting that these methods are complementary rather than substitutes. We further investigate the role of humans through a controlled classroom experiment that embeds LLM recommendations into a human-in-the-loop decision pipeline. Contrary to prior findings that human-AI collaboration can degrade performance, we show that, on average, human-AI teams achieve higher profits than either humans or AI agents operating alone. Beyond this population-level finding, we formalize an individual-level complementarity effect and derive a distribution-free lower bound on the fraction of individuals who benefit from AI collaboration; empirically, we find this fraction to be substantial.

View on arXiv PDF

Similar