TRJan 26, 2012
Liquidation in Limit Order Books with Controlled IntensityErhan Bayraktar, Michael Ludkovski
We consider a framework for solving optimal liquidation problems in limit order books. In particular, order arrivals are modeled as a point process whose intensity depends on the liquidation price. We set up a stochastic control problem in which the goal is to maximize the expected revenue from liquidating the entire position held. We solve this optimal liquidation problem for power-law and exponential-decay order book models and discuss several extensions. We also consider the continuous selling (or fluid) limit when the trading units are ever smaller and the intensity is ever larger. This limit provides an analytical approximation to the value function and the optimal solution. Using techniques from viscosity solutions we show that the discrete state problem and its optimal solution converge to the corresponding quantities in the continuous selling limit uniformly on compacts.
PRSep 24, 2013
Stochastic Perron's method for Hamilton-Jacobi-Bellman equationsErhan Bayraktar, Mihai Sirbu
We show that the value function of a stochastic control problem is the unique solution of the associated Hamilton-Jacobi-Bellman (HJB) equation, completely avoiding the proof of the so-called dynamic programming principle (DPP). Using Stochastic Perron's method we construct a super-solution lying below the value function and a sub-solution dominating it. A comparison argument easily closes the proof. The program has the precise meaning of verification for viscosity-solutions, obtaining the DPP as a conclusion. It also immediately follows that the weak and strong formulations of the stochastic control problem have the same value. Using this method we also capture the possible face-lifting phenomenon in a straightforward manner.
OCJan 14, 2013
On the Multi-Dimensional Controller and Stopper GamesErhan Bayraktar, Yu-Jui Huang
We consider a zero-sum stochastic differential controller-and-stopper game in which the state process is a controlled diffusion evolving in a multi-dimensional Euclidean space. In this game, the controller affects both the drift and the volatility terms of the state process. Under appropriate conditions, we show that the game has a value and the value function is the unique viscosity solution to an obstacle problem for a Hamilton-Jacobi-Bellman equation.
PRJul 13, 2011
Stochatic Perron's method and verification without smoothness using viscosity comparison: the linear caseErhan Bayraktar, Mihai Sirbu
We introduce a probabilistic version of the classical Perron's method to construct viscosity solutions to linear parabolic equations associated to stochastic differential equations. Using this method, we construct easily two viscosity (sub and super) solutions that squeeze in between the expected payoff. If a comparison result holds true, then there exists a unique viscosity solution which is a martingale along the solutions of the stochastic differential equation. The unique viscosity solution is actually equal to the expected payoff. This amounts to a verification result (Ito's Lemma) for non-smooth viscosity solutions of the linear parabolic equation. This is the first step in a larger program to prove verification for viscosity solutions and the Dynamic Programming Principle for stochastic control problems and games
PRApr 11, 2016
On the Robust Optimal Stopping ProblemErhan Bayraktar, Song Yao
We study a robust optimal stopping problem with respect to a set $\cP$ of mutually singular probabilities. This can be interpreted as a zero-sum controller-stopper game in which the stopper is trying to maximize its pay-off while an adverse player wants to minimize this payoff by choosing an evaluation criteria from $\cP$. We show that the \emph{upper Snell envelope $\ol{Z}$} of the reward process $Y$ is a supermartingale with respect to an appropriately defined nonlinear expectation $\ul{\sE}$, and $\ol{Z}$ is further an $\ul{\sE}-$martingale up to the first time $\t^*$ when $\ol{Z}$ meets $Y$. Consequently, $\t^*$ is the optimal stopping time for the robust optimal stopping problem and the corresponding zero-sum game has a value. Although the result seems similar to the one obtained in the classical optimal stopping theory, the mutual singularity of probabilities and the game aspect of the problem give rise to major technical hurdles, which we circumvent using some new methods.
RADec 25, 2018
High order Bellman equations and weakly chained diagonally dominant tensorsParsiad Azimzadeh, Erhan Bayraktar
We introduce high order Bellman equations, extending classical Bellman equations to the tensor setting. We introduce weakly chained diagonally dominant (w.c.d.d.) tensors and show that a sufficient condition for the existence and uniqueness of a positive solution to a high order Bellman equation is that the tensors appearing in the equation are w.c.d.d. M-tensors. In this case, we give a policy iteration algorithm to compute this solution. We also prove that a weakly diagonally dominant Z-tensor with nonnegative diagonals is a strong M-tensor if and only if it is w.c.d.d. This last point is analogous to a corresponding result in the matrix setting and tightens a result from [L. Zhang, L. Qi, and G. Zhou. "M-tensors and some applications." SIAM Journal on Matrix Analysis and Applications (2014)]. We apply our results to obtain a provably convergent numerical scheme for an optimal control problem using an "optimize then discretize" approach which outperforms (in both computation time and accuracy) a classical "discretize then optimize" approach. To the best of our knowledge, a link between M-tensors and optimal control has not been previously established.
PMMay 5, 2011
Minimizing the Probability of Lifetime Ruin under Stochastic VolatilityErhan Bayraktar, Xueying Hu, Virginia R. Young
We assume that an individual invests in a financial market with one riskless and one risky asset, with the latter's price following a diffusion with stochastic volatility. In the current financial market especially, it is important to include stochastic volatility in the risky asset's price process. Given the rate of consumption, we find the optimal investment strategy for the individual who wishes to minimize the probability of going bankrupt. To solve this minimization problem, we use techniques from stochastic optimal control.
PRDec 3, 2014
Quickest Detection with Discretely Controlled ObservationsErhan Bayraktar, Ross Kravitz
We study a continuous time Bayesian quickest detection problem in which observation times are a scarce resource. The agent, limited to making a finite number of discrete observations, must adaptively decide his observation strategy to minimize detection delay and the probability of false alarm. Under two different models of observation rights, we establish the existence of optimal strategies, and formulate an algorithmic approach to the problem via jump operators. We describe algorithms for these problems, and illustrate them with some numerical results. As the number of observation rights tends to infinity, we also show convergence to the classical continuous observation problem of Shiryaev.
OCAug 27, 2018
A numerical scheme for a mean field game in some queueing systems based on Markov chain approximation methodErhan Bayraktar, Amarjit Budhiraja, Asaf Cohen
We use the Markov chain approximation method to construct approximations for the solution of the mean field game (MFG) with reflecting barriers studied in Bayraktar, Budhiraja, and Cohen (2017). The MFG is formulated in terms of a controlled reflected diffusion with a cost function that depends on the reflection terms in addition to the standard variables: state, control, and the mean field term. This MFG arises from the asymptotic analysis of an $N$-player game for single server queues with strategic servers. By showing that our scheme is an almost contraction, we establish the convergence of this numerical scheme over a small time interval.
MLJun 22, 2023
Fitted value iteration methods for bicausal optimal transportErhan Bayraktar, Bingyan Han
We develop a fitted value iteration (FVI) method to compute bicausal optimal transport (OT) where couplings have an adapted structure. Based on the dynamic programming formulation, FVI adopts a function class to approximate the value functions in bicausal OT. Under the concentrability condition and approximate completeness assumption, we prove the sample complexity using (local) Rademacher complexity. Furthermore, we demonstrate that multilayer neural networks with appropriate structures satisfy the crucial assumptions required in sample complexity proofs. Numerical experiments reveal that FVI outperforms linear programming and adapted Sinkhorn methods in scalability as the time horizon increases, while still maintaining acceptable accuracy.
CPNov 21, 2022
Deep Signature Algorithm for Multi-dimensional Path-Dependent OptionsErhan Bayraktar, Qi Feng, Zhaoyu Zhang
In this work, we study the deep signature algorithms for path-dependent options. We extend the backward scheme in [Huré-Pham-Warin. Mathematics of Computation 89, no. 324 (2020)] for state-dependent FBSDEs with reflections to path-dependent FBSDEs with reflections, by adding the signature layer to the backward scheme. Our algorithm applies to both European and American type option pricing problems while the payoff function depends on the whole paths of the underlying forward stock process. We prove the convergence analysis of our numerical algorithm with explicit dependence on the truncation order of the signature and the neural network approximation errors. Numerical examples for the algorithm are provided including: Amerasian option under the Black-Scholes model, American option with a path-dependent geometric mean payoff function, and the Shiryaev's optimal stopping problem.
LGMay 6
Conditional Diffusion Under Linear Constraints: Langevin Mixing and Information-Theoretic GuaranteesAhmad Aghapour, Erhan Bayraktar, Asaf Cohen
We study zero-shot conditional sampling with pretrained diffusion models for linear inverse problems, including inpainting and super-resolution. In these problems, the observation determines only part of the unknown signal. The remaining degrees of freedom must be sampled according to the correct conditional data distribution. Existing projection-based samplers enforce measurement consistency by correcting the observed component during reverse diffusion. However, measurement consistency alone does not determine how probability mass should be distributed along the feasible set, and this can lead to biased conditional samples. We analyze this issue through a normal--tangent decomposition of the score function. For Gaussian noising, the observed-direction score is exactly determined by the measurement; only the tangent conditional score is unknown. We prove that the error from replacing this score by the unconditional tangent score is upper bounded by a dimension-free conditional mutual information between observed and unobserved components. This gives an information-theoretic decomposition into initialization and pathwise score-mismatch errors. Motivated by the theory, we propose a projected-Langevin initialization followed by guided reverse denoising, which outperforms a strong projection-based baseline in inpainting and super-resolution experiments.
LGApr 13
Continuous-time Online Learning via Mean-Field Neural Networks: Regret Analysis in Diffusion EnvironmentsErhan Bayraktar, Bingyan Han, Ziqing Zhang
We study continuous-time online learning where data are generated by a diffusion process with unknown coefficients. The learner employs a two-layer neural network, continuously updating its parameters in a non-anticipative manner. The mean-field limit of the learning dynamics corresponds to a stochastic Wasserstein gradient flow adapted to the data filtration. We establish regret bounds for both the mean-field limit and finite-particle system. Our analysis leverages the logarithmic Sobolev inequality, Polyak-Lojasiewicz condition, Malliavin calculus, and uniform-in-time propagation of chaos. Under displacement convexity, we obtain a constant static regret bound. In the general non-convex setting, we derive explicit linear regret bounds characterizing the effects of data variation, entropic exploration, and quadratic regularization. Finally, our simulations demonstrate the outperformance of the online approach and the impact of network width and regularization parameters.
PRSep 2, 2022
A PDE approach for regret bounds under partial monitoringErhan Bayraktar, Ibrahim Ekren, Xin Zhang
In this paper, we study a learning problem in which a forecaster only observes partial information. By properly rescaling the problem, we heuristically derive a limiting PDE on Wasserstein space which characterizes the asymptotic behavior of the regret of the forecaster. Using a verification type argument, we show that the problem of obtaining regret bounds and efficient algorithms can be tackled by finding appropriate smooth sub/supersolutions of this parabolic PDE.
LGMay 8
When Diffusion Model Can Ignore Dimension: An Entropy-Based TheoryAhmad Aghapour, Erhan Bayraktar
Diffusion models perform remarkably well on high-dimensional data such as images, often using only a modest number of reverse-time steps. Despite this practical success, existing convergence theory does not fully explain why such samplers remain efficient in high dimensions. Many prior KL guarantees bound the discretization error in terms of the ambient dimension, while other improved results replace this dependence using intrinsic-dimensional or geometric structure assumptions. In this work, we develop an alternative information-theoretic perspective on diffusion sampler convergence. We prove that, for Gaussian mixture targets, the discretization error is controlled by the Shannon entropy of the latent mixture component rather than by the ambient dimension. Consequently, the leading step complexity scales linearly with latent entropy and depends only logarithmically on the second moment of the data. Our analysis also extends to discrete target distributions, where the relevant complexity is the entropy of the target rather than the dimension of the embedding space. These results suggest that diffusion sampling can remain efficient in high-dimensional spaces when the data distribution admits a compact latent representation, as is widely believed to be the case for natural images.
LGNov 10, 2025
Deep Neural Operator Learning for Probabilistic ModelsErhan Bayraktar, Qi Feng, Zecheng Zhang et al.
We propose a deep neural-operator framework for a general class of probability models. Under global Lipschitz conditions on the operator over the entire Euclidean space-and for a broad class of probabilistic models-we establish a universal approximation theorem with explicit network-size bounds for the proposed architecture. The underlying stochastic processes are required only to satisfy integrability and general tail-probability conditions. We verify these assumptions for both European and American option-pricing problems within the forward-backward SDE (FBSDE) framework, which in turn covers a broad class of operators arising from parabolic PDEs, with or without free boundaries. Finally, we present a numerical example for a basket of American options, demonstrating that the learned model produces optimal stopping boundaries for new strike prices without retraining.
LGJan 29
Entropy-Based Dimension-Free Convergence and Loss-Adaptive Schedules for Diffusion ModelsAhmad Aghapour, Erhan Bayraktar, Ziqing Zhang
Diffusion generative models synthesize samples by discretizing reverse-time dynamics driven by a learned score (or denoiser). Existing convergence analyses of diffusion models typically scale at least linearly with the ambient dimension, and sharper rates often depend on intrinsic-dimension assumptions or other geometric restrictions on the target distribution. We develop an alternative, information-theoretic approach to dimension-free convergence that avoids any geometric assumptions. Under mild assumptions on the target distribution, we bound KL divergence between the target and generated distributions by $O(H^2/K)$ (up to endpoint factors), where $H$ is the Shannon entropy and $K$ is the number of sampling steps. Moreover, using a reformulation of the KL divergence, we propose a Loss-Adaptive Schedule (LAS) for efficient discretization of reverse SDE which is lightweight and relies only on the training loss, requiring no post-training heavy computation. Empirically, LAS improves sampling quality over common heuristic schedules.
OCFeb 1, 2025
Uniform-in-time weak propagation of chaos for consensus-based optimizationErhan Bayraktar, Ibrahim Ekren, Hongyi Zhou
We study the uniform-in-time weak propagation of chaos for the consensus-based optimization (CBO) method on a bounded searching domain. We apply the methodology for studying long-time behaviors of interacting particle systems developed in the work of Delarue and Tse (ArXiv:2104.14973). Our work shows that the weak error has order $O(N^{-1})$ uniformly in time, where $N$ denotes the number of particles. The main strategy behind the proofs are the decomposition of the weak errors using the linearized Fokker-Planck equations and the exponential decay of their Sobolev norms. Consequently, our result leads to the joint convergence of the empirical distribution of the CBO particle system to the Dirac-delta distribution at the global minimizer in population size and running time in Wasserstein-type metrics.
LGOct 31, 2020
Prediction against a limited adversaryErhan Bayraktar, Ibrahim Ekren, Xin Zhang
We study the problem of prediction with expert advice with adversarial corruption where the adversary can at most corrupt one expert. Using tools from viscosity theory, we characterize the long-time behavior of the value function of the game between the forecaster and the adversary. We provide lower and upper bounds for the growth rate of regret without relying on a comparison result. We show that depending on the description of regret, the limiting behavior of the game can significantly differ.
OCMar 18, 2020
Malicious Experts versus the multiplicative weights algorithm in online predictionErhan Bayraktar, H. Vincent Poor, Xin Zhang
We consider a prediction problem with two experts and a forecaster. We assume that one of the experts is honest and makes correct prediction with probability $μ$ at each round. The other one is malicious, who knows true outcomes at each round and makes predictions in order to maximize the loss of the forecaster. Assuming the forecaster adopts the classical multiplicative weights algorithm, we find upper and lower bounds for the value function of the malicious expert. Our results imply that the multiplicative weights algorithm cannot resist the corruption of malicious experts. We also show that an adaptive multiplicative weights algorithm is asymptotically optimal for the forecaster, and hence more resistant to the corruption of malicious experts.
PRNov 22, 2019
Finite-Time 4-Expert Prediction ProblemErhan Bayraktar, Ibrahim Ekren, Xin Zhang
We explicitly solve the nonlinear PDE that is the continuous limit of dynamic programming of \emph{expert prediction problem} in finite horizon setting with $N=4$ experts. The \emph{expert prediction problem} is formulated as a zero sum game between a player and an adversary. By showing that the solution is $\mathcal{C}^2$, we are able to show that the strategies conjectured in arXiv:1409.3040G form an asymptotic Nash equilibrium. We also prove the "Finite vs Geometric regret" conjecture proposed in arXiv:1409.3040G for $N=4$, and and show that this conjecture in fact follows from the conjecture that the comb strategies are optimal.
MLMar 27, 2019
On the Adversarial Robustness of Multivariate Robust EstimationErhan Bayraktar, Lifeng Lai
In this paper, we investigate the adversarial robustness of multivariate $M$-Estimators. In the considered model, after observing the whole dataset, an adversary can modify all data points with the goal of maximizing inference errors. We use adversarial influence function (AIF) to measure the asymptotic rate at which the adversary can change the inference result. We first characterize the adversary's optimal modification strategy and its corresponding AIF. From the defender's perspective, we would like to design an estimator that has a small AIF. For the case of joint location and scale estimation problem, we characterize the optimal $M$-estimator that has the smallest AIF. We further identify a tradeoff between robustness against adversarial modifications and robustness against outliers, and derive the optimal $M$-estimator that achieves the best tradeoff.
PRFeb 6, 2019
On the asymptotic optimality of the comb strategy for prediction with expert adviceErhan Bayraktar, Ibrahim Ekren, Yili Zhang
For the problem of prediction with expert advice in the adversarial setting with geometric stopping, we compute the exact leading order expansion for the long time behavior of the value function. Then, we use this expansion to prove that as conjectured in Gravin et al. [12], the comb strategies are indeed asymptotically optimal for the adversary in the case of 4 experts.
NASep 10, 2018
Convergence of implicit schemes for Hamilton-Jacobi-Bellman quasi-variational inequalitiesParsiad Azimzadeh, Erhan Bayraktar, George Labahn
In [Azimzadeh, P., and P. A. Forsyth. "Weakly chained matrices, policy iteration, and impulse control." SIAM J. Num. Anal. 54.3 (2016): 1341-1364], we outlined the theory and implementation of computational methods for implicit schemes for Hamilton-Jacobi-Bellman quasi-variational inequalities (HJBQVIs). No convergence proofs were given therein. This work closes the gap by giving rigorous proofs of convergence. We do so by introducing the notion of nonlocal consistency and appealing to a Barles-Souganidis type analysis. Our results rely only on a well-known comparison principle and are independent of the specific form of the intervention operator.