William B. Haskell

9 papers

149 citations

Novelty 50%

AI Score 25

Ranked #169,975 of 205,806 authors (top 83%); #36,923 in LG (top 87%)

9 Papers

SY May 16, 2017

Approximate Value Iteration for Risk-aware Markov Decision Processes

Pengqian Yu, William B. Haskell, Huan Xu

We consider large-scale Markov decision processes (MDPs) with a risk measure of variability in cost, under the risk-aware MDP paradigm. Previous studies showed that risk-aware MDPs, based on a minimax approach to handling risk, can be solved using dynamic programming for small- to medium-sized problems. However, due to the "curse of dimensionality", MDPs that model real-life problems are typically prohibitively large for such approaches. In this paper, we employ an approximate dynamic programming approach, and develop a family of simulation-based algorithms to approximately solve large-scale risk-aware MDPs. In parallel, we develop a unified convergence analysis technique to derive sample complexity bounds for this new family of algorithms.

LG Nov 2, 2022

Learning to Price Supply Chain Contracts against a Learning Retailer

Xuejun Zhao, Ruihao Zhu, William B. Haskell

The rise of big data analytics has automated the decision-making of companies and increased supply chain agility. In this paper, we study the supply chain contract design problem faced by a data-driven supplier who needs to respond to the inventory decisions of the downstream retailer. Both the supplier and the retailer are uncertain about the market demand and need to learn about it sequentially. The goal for the supplier is to develop data-driven pricing policies with sublinear regret bounds under a wide range of possible retailer inventory policies for a fixed time horizon. To capture the dynamics induced by the retailer's learning policy, we first make a connection to non-stationary online learning by following the notion of variation budget. The variation budget quantifies the impact of the retailer's learning strategy on the supplier's decision-making. We then propose dynamic pricing policies for the supplier for both discrete and continuous demand. We also note that our proposed pricing policy only requires access to the support of the demand distribution, but critically, does not require the supplier to have any prior knowledge about the retailer's learning policy or the demand realizations. We examine several well-known data-driven policies for the retailer, including sample average approximation, distributionally robust optimization, and parametric approaches, and show that our pricing policies lead to sublinear regret bounds in all these cases. At the managerial level, we answer affirmatively that the supplier has a pricing policy with a sublinear regret bound under a wide range of retailer learning policies, even though the supplier faces a learning retailer and an unknown demand distribution. Our work also provides a novel perspective in data-driven operations management where the principal has to learn to react to the learning policies employed by other agents in the system.

LG Mar 25, 2020

Convergence of Recursive Stochastic Algorithms using Wasserstein Divergence

Abhishek Gupta, William B. Haskell

This paper develops a unified framework, based on iterated random operator theory, to analyze the convergence of constant stepsize recursive stochastic algorithms (RSAs). RSAs use randomization to efficiently compute expectations, and so their iterates form a stochastic process. The key idea of our analysis is to lift the RSA into an appropriate higher-dimensional space and then express it as an equivalent Markov chain. Instead of determining the convergence of this Markov chain (which may not converge under constant stepsize), we study the convergence of the distribution of this Markov chain. To study this, we define a new notion of Wasserstein divergence. We show that if the distribution of the iterates in the Markov chain satisfies a contraction property with respect to the Wasserstein divergence, then the Markov chain admits an invariant distribution. We show that convergence of a large family of constant stepsize RSAs can be understood using this framework, and we provide several detailed examples.
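The claim that constant-stepsize iterates need not converge while their distribution does can be seen in a toy case. The sketch below is not taken from the paper: it assumes a one-dimensional quadratic objective f(x) = E[(x - Z)^2] / 2 with Z standard normal, and the names `constant_stepsize_sgd` and `tail_variance` are hypothetical. Under these assumptions the iterates form an AR(1) Markov chain whose stationary variance is alpha / (2 - alpha), so they keep fluctuating around the minimizer instead of settling on it.

```python
import random


def constant_stepsize_sgd(alpha, n_steps, seed=0):
    """Run x_{k+1} = x_k - alpha * (x_k - z_k) with z_k ~ N(0, 1).

    This is SGD with constant stepsize alpha on the illustrative
    objective f(x) = E[(x - Z)^2] / 2, whose minimizer is x = 0.
    """
    rng = random.Random(seed)
    x = 5.0  # arbitrary starting point (an assumption of this sketch)
    path = []
    for _ in range(n_steps):
        z = rng.gauss(0.0, 1.0)
        x -= alpha * (x - z)  # noisy gradient step
        path.append(x)
    return path


def tail_variance(path, burn_in):
    """Empirical variance of the iterates after discarding a burn-in."""
    tail = path[burn_in:]
    mean = sum(tail) / len(tail)
    return sum((v - mean) ** 2 for v in tail) / len(tail)


alpha = 0.1
path = constant_stepsize_sgd(alpha, n_steps=200_000)
empirical = tail_variance(path, burn_in=1_000)
# For this linear recursion the stationary variance is alpha / (2 - alpha).
theoretical = alpha / (2.0 - alpha)
print("empirical variance:", empirical)
print("stationary variance:", theoretical)
```

The two printed values should be close, which suggests the iterates settle into an invariant distribution rather than a single point. Shrinking `alpha` tightens that distribution around the minimizer.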

ML Jun 22, 2019

A Unifying Framework for Variance Reduction Algorithms for Finding Zeroes of Monotone Operators

Xun Zhang, William B. Haskell, Zhisheng Ye

It is common to encounter large-scale monotone inclusion problems where the objective has a finite sum structure. We develop a general framework for variance-reduced forward-backward splitting algorithms for this problem. This framework includes a number of existing deterministic and variance-reduced algorithms for function minimization as special cases, and it is also applicable to more general problems such as saddle-point problems and variational inequalities. With a carefully constructed Lyapunov function, we show that the algorithms covered by our framework enjoy a linear convergence rate in expectation under mild assumptions. We further consider Catalyst acceleration and asynchronous implementation to reduce the algorithmic complexity and computation time. We apply our proposed framework to a policy evaluation problem and a strongly monotone two-player game, both of which fall outside of function minimization.
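For readers unfamiliar with forward-backward splitting, a deterministic toy instance may help. The sketch below is not the paper's variance-reduced framework: it assumes the simple problem of minimizing 0.5 * ||x - b||^2 + lam * ||x||_1, where the forward step is a gradient step on the smooth term and the backward step is the proximal (soft-thresholding) map of the l1 term. The function names and parameter values are hypothetical.

```python
def soft_threshold(v, tau):
    """Proximal map of tau * |.|, i.e. the backward (resolvent) step."""
    if v > tau:
        return v - tau
    if v < -tau:
        return v + tau
    return 0.0


def forward_backward(b, lam, step=0.5, n_iters=100):
    """Forward-backward splitting for 0.5 * ||x - b||^2 + lam * ||x||_1.

    Forward step: x - step * (x - b), a gradient step on the smooth term.
    Backward step: soft-thresholding with threshold step * lam.
    """
    x = [0.0] * len(b)
    for _ in range(n_iters):
        x = [
            soft_threshold(xi - step * (xi - bi), step * lam)
            for xi, bi in zip(x, b)
        ]
    return x


b = [3.0, 0.5, -3.0]
lam = 1.0
solution = forward_backward(b, lam)
# For this separable problem the minimizer is soft_threshold(b_i, lam).
closed_form = [soft_threshold(bi, lam) for bi in b]
print(solution)
print(closed_form)
```

In this toy problem the iteration contracts toward the closed-form minimizer at a linear rate. A variance-reduced method of the kind the abstract describes would replace the exact forward step with a cheaper stochastic estimate built from the finite-sum structure.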

SY Oct 9, 2018

Distributionally Robust Optimization for Sequential Decision Making

Zhi Chen, Pengqian Yu, William B. Haskell

The distributionally robust Markov Decision Process (MDP) approach asks for a distributionally robust policy that achieves the maximal expected total reward under the most adversarial distribution of uncertain parameters. In this paper, we study distributionally robust MDPs where ambiguity sets for the uncertain parameters are of a format that can easily incorporate in its description the uncertainty's generalized moment as well as statistical distance information. In this way, we generalize existing works on distributionally robust MDP with generalized-moment-based and statistical-distance-based ambiguity sets to incorporate information from the former class such as moments and dispersions to the latter class that critically depends on empirical observations of the uncertain parameters. We show that, under this format of ambiguity sets, the resulting distributionally robust MDP remains tractable under mild technical conditions. To be more specific, a distributionally robust policy can be constructed by solving a sequence of one-stage convex optimization subproblems.

RM May 17, 2018

Preference Elicitation and Robust Optimization with Multi-Attribute Quasi-Concave Choice Functions

William B. Haskell, Wenjie Huang, Huifu Xu

A decision maker's preferences are often captured by choice functions, which are used to rank prospects. In this paper, we consider ambiguity in choice functions over a multi-attribute prospect space. Our main result is a robust preference model where the optimal decision is based on the worst-case choice function from an ambiguity set constructed through preference elicitation with pairwise comparisons of prospects. Differing from existing works in the area, our focus is on quasi-concave choice functions rather than concave functions, and this enables us to cover a wide range of utility/risk preference problems including multi-attribute expected utility and $S$-shaped aspirational risk preferences. The robust choice function is increasing and quasi-concave but not necessarily translation invariant, a key property of monetary risk measures. We propose two approaches, based respectively on the support functions and level functions of quasi-concave functions, to develop tractable formulations of the maximin preference robust optimization model. The former gives rise to a mixed integer linear programming problem whereas the latter is equivalent to solving a sequence of convex risk minimization problems. To assess the effectiveness of the proposed robust preference optimization model and numerical schemes, we apply them to a security budget allocation problem and report some preliminary results from experiments.

OC May 11, 2018

Stochastic Approximation for Risk-aware Markov Decision Processes

Wenjie Huang, William B. Haskell

We develop a stochastic approximation-type algorithm to solve finite state/action, infinite-horizon, risk-aware Markov decision processes. Our algorithm has two loops. The inner loop computes the risk by solving a stochastic saddle-point problem. The outer loop performs $Q$-learning to compute an optimal risk-aware policy. Several widely investigated risk measures (e.g., conditional value-at-risk, optimized certainty equivalent, and absolute semi-deviation) are covered by our algorithm. Almost sure convergence and the convergence rate of the algorithm are established. For an error tolerance $\epsilon>0$ for the optimal $Q$-value estimation gap and learning rate $k\in(1/2,\,1]$, the overall convergence rate of our algorithm is $\Omega((\ln(1/\delta\epsilon)/\epsilon^{2})^{1/k}+(\ln(1/\epsilon))^{1/(1-k)})$ with probability at least $1-\delta$.
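To make concrete the idea of computing a risk measure by solving an optimization problem, the sketch below evaluates conditional value-at-risk (CVaR) of a cost sample through the standard Rockafellar-Uryasev formula, CVaR_alpha(X) = min_t { t + E[(X - t)^+] / (1 - alpha) }. This is a well-known variational form used here only for illustration; it is not the saddle-point formulation or the inner loop of the paper, and all function names are hypothetical.

```python
def ru_objective(samples, alpha, t):
    """Empirical Rockafellar-Uryasev objective t + E[(X - t)^+] / (1 - alpha)."""
    n = len(samples)
    excess = sum(max(x - t, 0.0) for x in samples)
    return t + excess / (n * (1.0 - alpha))


def cvar_rockafellar_uryasev(samples, alpha):
    """CVaR of costs via minimization over t.

    The empirical objective is convex and piecewise linear in t with
    kinks at the sample points, so scanning the samples finds the minimum.
    """
    return min(ru_objective(samples, alpha, t) for t in samples)


def cvar_tail_average(samples, alpha):
    """Average of the worst (1 - alpha) fraction of costs.

    Matches the formula above when n * (1 - alpha) is an integer.
    """
    n = len(samples)
    k = round(n * (1.0 - alpha))
    worst = sorted(samples, reverse=True)[:k]
    return sum(worst) / k


costs = [float(c) for c in range(1, 11)]  # illustrative costs 1, 2, ..., 10
alpha = 0.8
cvar_min = cvar_rockafellar_uryasev(costs, alpha)
cvar_avg = cvar_tail_average(costs, alpha)
print(cvar_min, cvar_avg)
```

Both routes agree on this sample up to rounding, which illustrates why a risk evaluation can be treated as an inner optimization problem nested inside an outer policy update.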

ML May 19, 2017

A Unified Framework for Stochastic Matrix Factorization via Variance Reduction

Renbo Zhao, William B. Haskell, Jiashi Feng

We propose a unified framework to speed up the existing stochastic matrix factorization (SMF) algorithms via variance reduction. Our framework is general and it subsumes several well-known SMF formulations in the literature. We perform a non-asymptotic convergence analysis of our framework and derive computational and sample complexities for our algorithm to converge to an $\epsilon$-stationary point in expectation. In addition, extensive experiments for a wide class of SMF formulations demonstrate that our framework consistently yields faster convergence and a more accurate output dictionary vis-à-vis state-of-the-art frameworks.

OC Apr 1, 2017

Stochastic L-BFGS: Improved Convergence Rates and Practical Acceleration Strategies

Renbo Zhao, William B. Haskell, Vincent Y. F. Tan

We revisit the stochastic limited-memory BFGS (L-BFGS) algorithm. By proposing a new framework for the convergence analysis, we prove improved convergence rates and computational complexities of the stochastic L-BFGS algorithms compared to previous works. In addition, we propose several practical acceleration strategies to speed up the empirical performance of such algorithms. We also provide theoretical analyses for most of the strategies. Experiments on large-scale logistic and ridge regression problems demonstrate that our proposed strategies yield significant improvements vis-à-vis competing state-of-the-art algorithms.