LGNov 29, 2023
Learning to Simulate: Generative Metamodeling via Quantile RegressionL. Jeff Hong, Yanxi Hou, Qingkai Zhang et al.
Stochastic simulation models effectively capture complex system dynamics but are often too slow for real-time decision-making. Traditional metamodeling techniques learn relationships between simulator inputs and a single output summary statistic, such as the mean or median. These techniques enable real-time predictions without additional simulations. However, they require prior selection of one appropriate output summary statistic, limiting their flexibility in practical applications. We propose a new concept: generative metamodeling. It aims to construct a "fast simulator of the simulator," generating random outputs significantly faster than the original simulator while preserving approximately equal conditional distributions. Generative metamodels enable rapid generation of numerous random outputs upon input specification, facilitating immediate computation of any summary statistic for real-time decision-making. We introduce a new algorithm, quantile-regression-based generative metamodeling (QRGMM), and establish its distributional convergence and convergence rate. Extensive numerical experiments demonstrate QRGMM's efficacy compared to other state-of-the-art generative algorithms in practical real-time decision-making scenarios.
MLAug 18, 2024
Efficient Budget Allocation for Large-Scale LLM-Enabled Virtual ScreeningZaile Li, Weiwei Fan, L. Jeff Hong
Screening tasks that aim to identify a small subset of top alternatives from a large pool are common in business decision-making processes. These tasks often require substantial human effort to evaluate each alternative's performance, making them time-consuming and costly. Motivated by recent advances in large language models (LLMs), particularly their ability to generate outputs that align well with human evaluations, we consider an LLM-as-human-evaluator approach for conducting screening virtually, thereby reducing the cost burden. To achieve scalability and cost-effectiveness in virtual screening, we identify that the stochastic nature of LLM outputs and their cost structure necessitate efficient budget allocation across all alternatives. To address this, we propose using a top-$m$ greedy evaluation mechanism, a simple yet effective approach that keeps evaluating the current top-$m$ alternatives, and design the explore-first top-$m$ greedy (EFG-$m$) algorithm. We prove that EFG-$m$ is both sample-optimal and consistent in large-scale virtual screening. Surprisingly, we also uncover a bonus ranking effect, where the algorithm naturally induces an indifference-based ranking within the selected subset. To further enhance practicality, we design a suite of algorithm variants to improve screening performance and computational efficiency. Numerical experiments validate our results and demonstrate the effectiveness of our algorithms. Lastly, we conduct a case study on LLM-based virtual screening. The study shows that while LLMs alone may not provide meaningful screening and ranking results when directly queried, integrating them with our sample-optimal algorithms unlocks their potential for cost-effective, large-scale virtual screening.
MLNov 27, 2025
UCB for Large-Scale Pure Exploration: Beyond Sub-GaussianityZaile Li, Weiwei Fan, L. Jeff Hong
Selecting the best alternative from a finite set represents a broad class of pure exploration problems. Traditional approaches to pure exploration have predominantly relied on Gaussian or sub-Gaussian assumptions on the performance distributions of all alternatives, which limit their applicability to non-sub-Gaussian especially heavy-tailed problems. The need to move beyond sub-Gaussianity may become even more critical in large-scale problems, which tend to be especially sensitive to distributional specifications. In this paper, motivated by the widespread use of upper confidence bound (UCB) algorithms in pure exploration and beyond, we investigate their performance in the large-scale, non-sub-Gaussian settings. We consider the simplest category of UCB algorithms, where the UCB value for each alternative is defined as the sample mean plus an exploration bonus that depends only on its own sample size. We abstract this into a meta-UCB algorithm and propose letting it select the alternative with the largest sample size as the best upon stopping. For this meta-UCB algorithm, we first derive a distribution-free lower bound on the probability of correct selection. Building on this bound, we analyze two general non-sub-Gaussian scenarios: (1) all alternatives follow a common location-scale structure and have bounded variance; and (2) when such a structure does not hold, each alternative has a bounded absolute moment of order $q > 3$. In both settings, we show that the meta-UCB algorithm and therefore a broad class of UCB algorithms can achieve the sample optimality. These results demonstrate the applicability of UCB algorithms for solving large-scale pure exploration problems with non-sub-Gaussian distributions. Numerical experiments support our results and provide additional insights into the comparative behaviors of UCB algorithms within and beyond our meta-UCB framework.
MLSep 7, 2025
Additive Distributionally Robust Ranking and SelectionZaile Li, Yuchen Wan, L. Jeff Hong
Ranking and selection (R&S) aims to identify the alternative with the best mean performance among $k$ simulated alternatives. The practical value of R&S depends on accurate simulation input modeling, which often suffers from the curse of input uncertainty due to limited data. Distributionally robust ranking and selection (DRR&S) addresses this challenge by modeling input uncertainty via an ambiguity set of $m > 1$ plausible input distributions, resulting in $km$ scenarios in total. Recent DRR&S studies suggest a key structural insight: additivity in budget allocation is essential for efficiency. However, existing justifications are heuristic, and fundamental properties such as consistency and the precise allocation pattern induced by additivity remain poorly understood. In this paper, we propose a simple additive allocation (AA) procedure that aims to exclusively sample the $k + m - 1$ previously hypothesized critical scenarios. Leveraging boundary-crossing arguments, we establish a lower bound on the probability of correct selection and characterize the procedure's budget allocation behavior. We then prove that AA is consistent and, surprisingly, achieves additivity in the strongest sense: as the total budget increases, only $k + m - 1$ scenarios are sampled infinitely often. Notably, the worst-case scenarios of non-best alternatives may not be among them, challenging prior beliefs about their criticality. These results offer new and counterintuitive insights into the additive structure of DRR&S. To improve practical performance while preserving this structure, we introduce a general additive allocation (GAA) framework that flexibly incorporates sampling rules from traditional R&S procedures in a modular fashion. Numerical experiments support our theoretical findings and demonstrate the competitive performance of the proposed GAA procedures.
LGJun 18, 2025
Conditional Generative Modeling for Enhanced Credit Risk Management in Supply Chain FinanceQingkai Zhang, L. Jeff Hong, Houmin Yan
The rapid expansion of cross-border e-commerce (CBEC) has created significant opportunities for small- and medium-sized sellers, yet financing remains a critical challenge due to their limited credit histories. Third-party logistics (3PL)-led supply chain finance (SCF) has emerged as a promising solution, leveraging in-transit inventory as collateral. We propose an advanced credit risk management framework tailored for 3PL-led SCF, addressing the dual challenges of credit risk assessment and loan size determination. Specifically, we leverage conditional generative modeling of sales distributions through Quantile-Regression-based Generative Metamodeling (QRGMM) as the foundation for risk measures estimation. We propose a unified framework that enables flexible estimation of multiple risk measures while introducing a functional risk measure formulation that systematically captures the relationship between these risk measures and varying loan levels, supported by theoretical guarantees. To capture complex covariate interactions in e-commerce sales data, we integrate QRGMM with Deep Factorization Machines (DeepFM). Extensive experiments on synthetic and real-world data validate the efficacy of our model for credit risk assessment and loan size determination. This study explores the use of generative models in CBEC SCF risk management, illustrating their potential to strengthen credit assessment and support financing for small- and medium-sized sellers.
AIMay 24, 2025
LLMs for Supply Chain ManagementHaojie Wang, Jiuyun Jiang, L. Jeff Hong et al.
The development of large language models (LLMs) has provided new tools for research in supply chain management (SCM). In this paper, we introduce a retrieval-augmented generation (RAG) framework that dynamically integrates external knowledge into the inference process, and develop a domain-specialized SCM LLM, which demonstrates expert-level competence by passing standardized SCM examinations and beer game tests. We further employ the use of LLMs to conduct horizontal and vertical supply chain games, in order to analyze competition and cooperation within supply chains. Our experiments show that RAG significantly improves performance on SCM tasks. Moreover, game-theoretic analysis reveals that the LLM can reproduce insights from the classical SCM literature, while also uncovering novel behaviors and offering fresh perspectives on phenomena such as the bullwhip effect. This paper opens the door for exploring cooperation and competition for complex supply chain network through the lens of LLMs.
LGFeb 1, 2024
AlphaRank: An Artificial Intelligence Approach for Ranking and Selection ProblemsRuihan Zhou, L. Jeff Hong, Yijie Peng
We introduce AlphaRank, an artificial intelligence approach to address the fixed-budget ranking and selection (R&S) problems. We formulate the sequential sampling decision as a Markov decision process and propose a Monte Carlo simulation-based rollout policy that utilizes classic R&S procedures as base policies for efficiently learning the value function of stochastic dynamic programming. We accelerate online sample-allocation by using deep reinforcement learning to pre-train a neural network model offline based on a given prior. We also propose a parallelizable computing framework for large-scale problems, effectively combining "divide and conquer" and "recursion" for enhanced scalability and efficiency. Numerical experiments demonstrate that the performance of AlphaRank is significantly improved over the base policies, which could be attributed to AlphaRank's superior capability on the trade-off among mean, variance, and induced correlation overlooked by many existing policies.
OCJan 15, 2022
Large-Scale Inventory Optimization: A Recurrent-Neural-Networks-Inspired Simulation ApproachTan Wan, L. Jeff Hong
Many large-scale production networks include thousands types of final products and tens to hundreds thousands types of raw materials and intermediate products. These networks face complicated inventory management decisions, which are often too complicated for inventory models and too large for simulation models. In this paper, by combing efficient computational tools of recurrent neural networks (RNN) and the structural information of production networks, we propose a RNN inspired simulation approach that may be thousands times faster than existing simulation approach and is capable of solving large-scale inventory optimization problems in a reasonable amount of time.
LGSep 17, 2020
Dimension Reduction in Contextual Online Learning via Nonparametric Variable SelectionWenhao Li, Ningyuan Chen, L. Jeff Hong
We consider a contextual online learning (multi-armed bandit) problem with high-dimensional covariate $\mathbf{x}$ and decision $\mathbf{y}$. The reward function to learn, $f(\mathbf{x},\mathbf{y})$, does not have a particular parametric form. The literature has shown that the optimal regret is $\tilde{O}(T^{(d_x+d_y+1)/(d_x+d_y+2)})$, where $d_x$ and $d_y$ are the dimensions of $\mathbf x$ and $\mathbf y$, and thus it suffers from the curse of dimensionality. In many applications, only a small subset of variables in the covariate affect the value of $f$, which is referred to as \textit{sparsity} in statistics. To take advantage of the sparsity structure of the covariate, we propose a variable selection algorithm called \textit{BV-LASSO}, which incorporates novel ideas such as binning and voting to apply LASSO to nonparametric settings. Our algorithm achieves the regret $\tilde{O}(T^{(d_x^*+d_y+1)/(d_x^*+d_y+2)})$, where $d_x^*$ is the effective covariate dimension. The regret matches the optimal regret when the covariate is $d^*_x$-dimensional and thus cannot be improved. Our algorithm may serve as a general recipe to achieve dimension reduction via variable selection in nonparametric settings.
MLJul 15, 2019
A Dimension-free Algorithm for Contextual Continuum-armed BanditsWenhao Li, Ningyuan Chen, L. Jeff Hong
In contextual continuum-armed bandits, the contexts $x$ and the arms $y$ are both continuous and drawn from high-dimensional spaces. The payoff function to learn $f(x,y)$ does not have a particular parametric form. The literature has shown that for Lipschitz-continuous functions, the optimal regret is $\tilde{O}(T^{\frac{d_x+d_y+1}{d_x+d_y+2}})$, where $d_x$ and $d_y$ are the dimensions of contexts and arms, and thus suffers from the curse of dimensionality. We develop an algorithm that achieves regret $\tilde{O}(T^{\frac{d_x+1}{d_x+2}})$ when $f$ is globally concave in $y$. The global concavity is a common assumption in many applications. The algorithm is based on stochastic approximation and estimates the gradient information in an online fashion. Our results generate a valuable insight that the curse of dimensionality of the arms can be overcome with some mild structures of the payoff function.
MLOct 7, 2017
Ranking and Selection with Covariates for Personalized Decision MakingHaihui Shen, L. Jeff Hong, Xiaowei Zhang
We consider a problem of ranking and selection via simulation in the context of personalized decision making, where the best alternative is not universal but varies as a function of some observable covariates. The goal of ranking and selection with covariates (R&S-C) is to use simulation samples to obtain a selection policy that specifies the best alternative with certain statistical guarantee for subsequent individuals upon observing their covariates. A linear model is proposed to capture the relationship between the mean performance of an alternative and the covariates. Under the indifference-zone formulation, we develop two-stage procedures for both homoscedastic and heteroscedastic simulation errors, respectively, and prove their statistical validity in terms of average probability of correct selection. We also generalize the well-known slippage configuration, and prove that the generalized slippage configuration is the least favorable configuration for our procedures. Extensive numerical experiments are conducted to investigate the performance of the proposed procedures, the experimental design issue, and the robustness to the linearity assumption. Finally, we demonstrate the usefulness of R&S-C via a case study of selecting the best treatment regimen in the prevention of esophageal cancer. We find that by leveraging disease-related personal information, R&S-C can substantially improve patients' expected quality-adjusted life years by providing patient-specific treatment regimen.