OCAug 3, 2024
Deep Reinforcement Learning for Dynamic Order Picking in Warehouse OperationsSasan Mahmoudinazlou, Abhay Sobhanan, Hadi Charkhgard et al.
Order picking is a pivotal operation in warehouses that directly impacts overall efficiency and profitability. This study addresses the dynamic order picking problem, a significant concern in modern warehouse management, where real-time adaptation to fluctuating order arrivals and efficient picker routing are crucial. Traditional methods, which often depend on static optimization algorithms designed around fixed order sets for the picker routing, fall short in addressing the challenges of this dynamic environment. To overcome these challenges, we propose a Deep Reinforcement Learning (DRL) framework tailored for single-block warehouses equipped with an autonomous picking device. By dynamically optimizing picker routes, our approach significantly reduces order throughput times and unfulfilled orders, particularly under high order arrival rates. We benchmark our DRL model against established algorithms, utilizing instances generated based on standard practices in the order picking literature. Experimental results demonstrate the superiority of our DRL model over benchmark algorithms. For example, at a high order arrival rate of 0.09 (i.e., 9 orders per 100 units of time on average), our approach achieves an order fulfillment rate of approximately 98%, compared to the 82% fulfillment rate observed with benchmarking algorithms. We further investigate the integration of a hyperparameter in the reward function that allows for flexible balancing between distance traveled and order completion time. Finally, we demonstrate the robustness of our DRL model on out-of-sample test instances.
LGDec 1, 2022
A Hybrid Statistical-Machine Learning Approach for Analysing Online Customer Behavior: An Empirical StudySaed Alizamir, Kasun Bandara, Ali Eshragh et al.
We apply classical statistical methods in conjunction with the state-of-the-art machine learning techniques to develop a hybrid interpretable model to analyse 454,897 online customers' behavior for a particular product category at the largest online retailer in China, that is JD. While most mere machine learning methods are plagued by the lack of interpretability in practice, our novel hybrid approach will address this practical issue by generating explainable output. This analysis involves identifying what features and characteristics have the most significant impact on customers' purchase behavior, thereby enabling us to predict future sales with a high level of accuracy, and identify the most impactful variables. Our results reveal that customers' product choice is insensitive to the promised delivery time, but this factor significantly impacts customers' order quantity. We also show that the effectiveness of various discounting methods depends on the specific product and the discount size. We identify product classes for which certain discounting approaches are more effective and provide recommendations on better use of different discounting tools. Customers' choice behavior across different product classes is mostly driven by price, and to a lesser extent, by customer demographics. The former finding asks for exercising care in deciding when and how much discount should be offered, whereas the latter identifies opportunities for personalized ads and targeted marketing. Further, to curb customers' batch ordering behavior and avoid the undesirable Bullwhip effect, JD should improve its logistics to ensure faster delivery of orders.
MLDec 30, 2023
SALSA: Sequential Approximate Leverage-Score Algorithm with Application in Analyzing Big Time Series DataAli Eshragh, Luke Yerbury, Asef Nazari et al.
We develop a new efficient sequential approximate leverage score algorithm, SALSA, using methods from randomized numerical linear algebra (RandNLA) for large matrices. We demonstrate that, with high probability, the accuracy of SALSA's approximations is within $(1 + O({\varepsilon}))$ of the true leverage scores. In addition, we show that the theoretical computational complexity and numerical accuracy of SALSA surpass existing approximations. These theoretical results are subsequently utilized to develop an efficient algorithm, named LSARMA, for fitting an appropriate ARMA model to large-scale time series data. Our proposed algorithm is, with high probability, guaranteed to find the maximum likelihood estimates of the parameters for the true underlying ARMA model. Furthermore, it has a worst-case running time that significantly improves those of the state-of-the-art alternatives in big data regimes. Empirical results on large-scale data strongly support these theoretical results and underscore the efficacy of our new approach.
LGFeb 5, 2024
Deep Reinforcement Learning for Picker Routing Problem in WarehousingGeorge Dunn, Hadi Charkhgard, Ali Eshragh et al.
Order Picker Routing is a critical issue in Warehouse Operations Management. Due to the complexity of the problem and the need for quick solutions, suboptimal algorithms are frequently employed in practice. However, Reinforcement Learning offers an appealing alternative to traditional heuristics, potentially outperforming existing methods in terms of speed and accuracy. We introduce an attention based neural network for modeling picker tours, which is trained using Reinforcement Learning. Our method is evaluated against existing heuristics across a range of problem parameters to demonstrate its efficacy. A key advantage of our proposed method is its ability to offer an option to reduce the perceived complexity of routes.
LGOct 17, 2025
Reflections from Research Roundtables at the Conference on Health, Inference, and Learning (CHIL) 2025Emily Alsentzer, Marie-Laure Charpignon, Bill Chen et al.
The 6th Annual Conference on Health, Inference, and Learning (CHIL 2025), hosted by the Association for Health Learning and Inference (AHLI), was held in person on June 25-27, 2025, at the University of California, Berkeley, in Berkeley, California, USA. As part of this year's program, we hosted Research Roundtables to catalyze collaborative, small-group dialogue around critical, timely topics at the intersection of machine learning and healthcare. Each roundtable was moderated by a team of senior and junior chairs who fostered open exchange, intellectual curiosity, and inclusive engagement. The sessions emphasized rigorous discussion of key challenges, exploration of emerging opportunities, and collective ideation toward actionable directions in the field. In total, eight roundtables were held by 19 roundtable chairs on topics of "Explainability, Interpretability, and Transparency," "Uncertainty, Bias, and Fairness," "Causality," "Domain Adaptation," "Foundation Models," "Learning from Small Medical Data," "Multimodal Methods," and "Scalable, Translational Healthcare Solutions."
LGJul 16, 2025
Ranking Vectors Clustering: Theory and ApplicationsAli Fattahi, Ali Eshragh, Babak Aslani et al.
We study the problem of clustering ranking vectors, where each vector represents preferences as an ordered list of distinct integers. Specifically, we focus on the k-centroids ranking vectors clustering problem (KRC), which aims to partition a set of ranking vectors into k clusters and identify the centroid of each cluster. Unlike classical k-means clustering (KMC), KRC constrains both the observations and centroids to be ranking vectors. We establish the NP-hardness of KRC and characterize its feasible set. For the single-cluster case, we derive a closed-form analytical solution for the optimal centroid, which can be computed in linear time. To address the computational challenges of KRC, we develop an efficient approximation algorithm, KRCA, which iteratively refines initial solutions from KMC, referred to as the baseline solution. Additionally, we introduce a branch-and-bound (BnB) algorithm for efficient cluster reconstruction within KRCA, leveraging a decision tree framework to reduce computational time while incorporating a controlling parameter to balance solution quality and efficiency. We establish theoretical error bounds for KRCA and BnB. Through extensive numerical experiments on synthetic and real-world datasets, we demonstrate that KRCA consistently outperforms baseline solutions, delivering significant improvements in solution quality with fast computational times. This work highlights the practical significance of KRC for personalization and large-scale decision making, offering methodological advancements and insights that can be built upon in future studies.
MLDec 24, 2021
Toeplitz Least Squares Problems, Fast Algorithms and Big DataAli Eshragh, Oliver Di Pietro, Michael A. Saunders
In time series analysis, when fitting an autoregressive model, one must solve a Toeplitz ordinary least squares problem numerous times to find an appropriate model, which can severely affect computational times with large data sets. Two recent algorithms (LSAR and Repeated Halving) have applied randomized numerical linear algebra (RandNLA) techniques to fitting an autoregressive model to big time-series data. We investigate and compare the quality of these two approximation algorithms on large-scale synthetic and real-world data. While both algorithms display comparable results for synthetic datasets, the LSAR algorithm appears to be more robust when applied to real-world time series data. We conclude that RandNLA is effective in the context of big-data time series.
LGOct 18, 2020
Average-reward model-free reinforcement learning: a systematic review and literature mappingVektor Dewanto, George Dunn, Ali Eshragh et al.
Reinforcement learning is important part of artificial intelligence. In this paper, we review model-free reinforcement learning that utilizes the average reward optimality criterion in the infinite horizon setting. Motivated by the solo survey by Mahadevan (1996a), we provide an updated review of work in this area and extend it to cover policy-iteration and function approximation methods (in addition to the value-iteration and tabular counterparts). We present a comprehensive literature mapping. We also identify and discuss opportunities for future work.
MENov 27, 2019
LSAR: Efficient Leverage Score Sampling Algorithm for the Analysis of Big Time Series DataAli Eshragh, Fred Roosta, Asef Nazari et al.
We apply methods from randomized numerical linear algebra (RandNLA) to develop improved algorithms for the analysis of large-scale time series data. We first develop a new fast algorithm to estimate the leverage scores of an autoregressive (AR) model in big data regimes. We show that the accuracy of approximations lies within $(1+\bigO{\varepsilon})$ of the true leverage scores with high probability. These theoretical results are subsequently exploited to develop an efficient algorithm, called LSAR, for fitting an appropriate AR model to big time series data. Our proposed algorithm is guaranteed, with high probability, to find the maximum likelihood estimates of the parameters of the underlying true AR model and has a worst case running time that significantly improves those of the state-of-the-art alternatives in big data regimes. Empirical results on large-scale synthetic as well as real data highly support the theoretical results and reveal the efficacy of this new approach.
MLJan 30, 2019
Learning to Project in Multi-Objective Binary Linear ProgrammingAlvaro Sierra-Altamiranda, Hadi Charkhgard, Iman Dayarian et al.
In this paper, we investigate the possibility of improving the performance of multi-objective optimization solution approaches using machine learning techniques. Specifically, we focus on multi-objective binary linear programs and employ one of the most effective and recently developed criterion space search algorithms, the so-called KSA, during our study. This algorithm computes all nondominated points of a problem with p objectives by searching on a projected criterion space, i.e., a (p-1)-dimensional criterion apace. We present an effective and fast learning approach to identify on which projected space the KSA should work. We also present several generic features/variables that can be used in machine learning techniques for identifying the best projected space. Finally, we present an effective bi-objective optimization based heuristic for selecting the best subset of the features to overcome the issue of overfitting in learning. Through an extensive computational study over 2000 instances of tri-objective Knapsack and Assignment problems, we demonstrate that an improvement of up to 12% in time can be achieved by the proposed learning method compared to a random selection of the projected space.