A Comparative Study on Forecasting of Retail Sales
This incremental study addresses sales prediction challenges for retail companies, providing practical insights into model trade-offs.
The paper benchmarks time-series forecasting models on Walmart sales data from the M5 Kaggle dataset, finding that ARIMA outperforms Facebook Prophet and LightGBM in accuracy, while LightGBM offers significant computational gains with minimal accuracy loss.
Predicting product sales for large retail companies is a challenging task, given the volatile nature of trends, seasonality, and events, as well as unknown factors such as market competition, changes in customer preferences, or unforeseen events, e.g., the COVID-19 outbreak. In this paper, we benchmark forecasting models on historical sales data from Walmart to predict future sales. We provide a comprehensive theoretical overview and analysis of state-of-the-art time-series forecasting models. Then, we apply these models to the forecasting challenge dataset (M5 forecasting by Kaggle). Specifically, we use a traditional model, namely ARIMA (Autoregressive Integrated Moving Average), and recently developed advanced models, e.g., the Prophet model developed by Facebook and the light gradient boosting machine (LightGBM) model developed by Microsoft, and benchmark their performance. Results suggest that the ARIMA model outperforms the Facebook Prophet and LightGBM models, while the LightGBM model achieves a large computational gain on the large dataset with a negligible compromise in prediction accuracy.
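The benchmarking protocol described above (fit each model on a historical window, forecast a held-out horizon, then compare accuracy and computation time) can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the abstract does not name its error metric, so RMSE is assumed here; the two forecasters are simple stand-ins for ARIMA, Prophet, and LightGBM; and the series is synthetic rather than taken from M5. All function names are hypothetical.

```python
import math
import time


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences (assumed metric)."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


def seasonal_naive(train, horizon, season=7):
    """Stand-in forecaster: repeat the last full season of the training data."""
    last_season = train[-season:]
    return [last_season[i % season] for i in range(horizon)]


def moving_average(train, horizon, window=7):
    """Stand-in forecaster: predict the mean of the last `window` observations."""
    mean = sum(train[-window:]) / window
    return [mean] * horizon


def benchmark(models, series, horizon):
    """Hold out the last `horizon` points, then score and time each forecaster."""
    train, test = series[:-horizon], series[-horizon:]
    results = {}
    for name, forecast in models.items():
        start = time.perf_counter()
        predicted = forecast(train, horizon)
        elapsed = time.perf_counter() - start
        results[name] = {"rmse": rmse(test, predicted), "seconds": elapsed}
    return results


# Synthetic daily sales with a pure weekly pattern (illustrative data, not M5).
weekly_pattern = [10, 12, 11, 13, 15, 20, 18]
series = [weekly_pattern[day % 7] for day in range(70)]

results = benchmark(
    {"seasonal_naive": seasonal_naive, "moving_average": moving_average},
    series,
    horizon=14,
)
for name, scores in results.items():
    print(f"{name}: rmse={scores['rmse']:.3f}, seconds={scores['seconds']:.6f}")
```

Any callable that takes a training series and a horizon and returns a list of predictions can be added to the `models` dictionary, so a wrapper around an ARIMA, Prophet, or LightGBM fit could be scored and timed in the same loop.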