MLOct 21, 2025
The Bias-Variance Tradeoff in Data-Driven Optimization: A Local Misspecification PerspectiveHaixiang Lan, Luofeng Liao, Adam N. Elmachtoub et al.
Data-driven stochastic optimization is ubiquitous in machine learning and operational decision-making problems. Sample average approximation (SAA) and model-based approaches such as estimate-then-optimize (ETO) or integrated estimation-optimization (IEO) are all popular, with model-based approaches being able to circumvent some of the issues with SAA in complex context-dependent problems. Yet the relative performance of these methods is poorly understood, with most results confined to the dichotomous cases of the model-based approach being either well-specified or misspecified. We develop the first results that allow for a more granular analysis of the relative performance of these methods under a local misspecification setting, which models the scenario where the model-based approach is nearly well-specified. By leveraging tools from contiguity theory in statistics, we show that there is a bias-variance tradeoff between SAA, IEO, and ETO under local misspecification, and that the relative importance of the bias and the variance depends on the degree of local misspecification. Moreover, we derive explicit expressions for the decision bias, which allows us to characterize (un)impactful misspecification directions, and provide further geometric understanding of the variance.
LGJun 18, 2021
Local AdaGrad-Type Algorithm for Stochastic Convex-Concave OptimizationLuofeng Liao, Li Shen, Jia Duan et al.
Large scale convex-concave minimax problems arise in numerous applications, including game theory, robust training, and training of generative adversarial networks. Despite their wide applicability, solving such problems efficiently and effectively is challenging in the presence of large amounts of data using existing stochastic minimax methods. We study a class of stochastic minimax methods and develop a communication-efficient distributed stochastic extragradient algorithm, LocalAdaSEG, with an adaptive learning rate suitable for solving convex-concave minimax problems in the Parameter-Server model. LocalAdaSEG has three main features: (i) a periodic communication strategy that reduces the communication cost between workers and the server; (ii) an adaptive learning rate that is computed locally and allows for tuning-free implementation; and (iii) theoretically, a nearly linear speed-up with respect to the dominant variance term, arising from the estimation of the stochastic gradient, is proven in both the smooth and nonsmooth convex-concave settings. LocalAdaSEG is used to solve a stochastic bilinear game, and train a generative adversarial network. We compare LocalAdaSEG against several existing optimizers for minimax problems and demonstrate its efficacy through several experiments in both homogeneous and heterogeneous settings.
MLFeb 19, 2021
Instrumental Variable Value Iteration for Causal Offline Reinforcement LearningLuofeng Liao, Zuyue Fu, Zhuoran Yang et al.
In offline reinforcement learning (RL) an optimal policy is learned solely from a priori collected observational data. However, in observational data, actions are often confounded by unobserved variables. Instrumental variables (IVs), in the context of RL, are the variables whose influence on the state variables is all mediated by the action. When a valid instrument is present, we can recover the confounded transition dynamics through observational data. We study a confounded Markov decision process where the transition dynamics admit an additive nonlinear functional form. Using IVs, we derive a conditional moment restriction through which we can identify transition dynamics based on observational data. We propose a provably efficient IV-aided Value Iteration (IVVI) algorithm based on a primal-dual reformulation of the conditional moment restriction. To our knowledge, this is the first provably efficient algorithm for instrument-aided offline RL.
MLJul 2, 2020
Provably Efficient Neural Estimation of Structural Equation Model: An Adversarial ApproachLuofeng Liao, You-Lin Chen, Zhuoran Yang et al.
Structural equation models (SEMs) are widely used in sciences, ranging from economics to psychology, to uncover causal relationships underlying a complex system under consideration and estimate structural parameters of interest. We study estimation in a class of generalized SEMs where the object of interest is defined as the solution to a linear operator equation. We formulate the linear operator equation as a min-max game, where both players are parameterized by neural networks (NNs), and learn the parameters of these neural networks using the stochastic gradient descent. We consider both 2-layer and multi-layer NNs with ReLU activation functions and prove global convergence in an overparametrized regime, where the number of neurons is diverging. The results are established using techniques from online learning and local linearization of NNs, and improve in several aspects the current state-of-the-art. For the first time we provide a tractable estimation procedure for SEMs based on NNs with provable convergence and without the need for sample splitting.