Yingkai Li

6papers

167citations

Novelty46%

AI Score40

Ranked #95,098 of 205,806 authors (top 46%)#21,025 in LG (top 50%)

6 Papers

4.5THMay 5

Going Public: Communication in Collective Decisions

Zhicheng Du, Yingkai Li, Boli Xu

A principal and $n\ge 2$ agents can launch a project if the principal proposes it and at least $k$ agents accept. Their individual payoffs from the project depend on an ex ante unknown state. The principal can conduct a test to learn about the state and then communicate her findings to the agents via cheap talk. This paper focuses on comparing two communication regimes: public and private messaging. We show that public messaging is weakly dominant: any outcome implementable under private messaging can also be implemented under public messaging. Moreover, in a canonical environment with linear payoffs, we characterize the principal's optimal test in each regime and show that public messaging can be strictly dominant if and only if there exist two agents who are the principal's conflicting allies.

LGJul 9, 2020

Multinomial Logit Bandit with Low Switching Cost

Kefan Dong, Yingkai Li, Qin Zhang et al.

We study multinomial logit bandit with limited adaptivity, where the algorithms change their exploration actions as infrequently as possible when achieving almost optimal minimax regret. We propose two measures of adaptivity: the assortment switching cost and the more fine-grained item switching cost. We present an anytime algorithm (AT-DUCB) with $O(N \log T)$ assortment switches, almost matching the lower bound $Ω(\frac{N \log T}{ \log \log T})$. In the fixed-horizon setting, our algorithm FH-DUCB incurs $O(N \log \log T)$ assortment switches, matching the asymptotic lower bound. We also present the ESUCB algorithm with item switching cost $O(N \log^2 T)$.

LGSep 4, 2019

Stochastic Linear Optimization with Adversarial Corruption

Yingkai Li, Edmund Y. Lou, Liren Shan

We extend the model of stochastic bandits with adversarial corruption (Lykouriset al., 2018) to the stochastic linear optimization problem (Dani et al., 2008). Our algorithm is agnostic to the amount of corruption chosen by the adaptive adversary. The regret of the algorithm only increases linearly in the amount of corruption. Our algorithm involves using Löwner-John's ellipsoid for exploration and dividing time horizon into epochs with exponentially increasing size to limit the influence of corruption.

MLMay 4, 2019

Tight Regret Bounds for Infinite-armed Linear Contextual Bandits

Yingkai Li, Yining Wang, Xi Chen et al.

Linear contextual bandit is an important class of sequential decision making problems with a wide range of applications to recommender systems, online advertising, healthcare, and many other machine learning related tasks. While there is a lot of prior research, tight regret bounds of linear contextual bandit with infinite action sets remain open. In this paper, we address this open problem by considering the linear contextual bandit with (changing) infinite action sets. We prove a regret upper bound on the order of $O(\sqrt{d^2T\log T})\times \text{poly}(\log\log T)$ where $d$ is the domain dimension and $T$ is the time horizon. Our upper bound matches the previous lower bound of $Ω(\sqrt{d^2 T\log T})$ in [Li et al., 2019] up to iterated logarithmic terms.

MLMar 30, 2019

Nearly Minimax-Optimal Regret for Linearly Parameterized Bandits

Yingkai Li, Yining Wang, Yuan Zhou

We study the linear contextual bandit problem with finite action sets. When the problem dimension is $d$, the time horizon is $T$, and there are $n \leq 2^{d/2}$ candidate actions per time period, we (1) show that the minimax expected regret is $Ω(\sqrt{dT (\log T) (\log n)})$ for every algorithm, and (2) introduce a Variable-Confidence-Level (VCL) SupLinUCB algorithm whose regret matches the lower bound up to iterated logarithmic factors. Our algorithmic result saves two $\sqrt{\log T}$ factors from previous analysis, and our information-theoretical lower bound also improves previous results by one $\sqrt{\log T}$ factor, revealing a regret scaling quite different from classical multi-armed bandits in which no logarithmic $T$ term is present in minimax regret. Our proof techniques include variable confidence levels and a careful analysis of layer sizes of SupLinUCB on the upper bound side, and delicately constructed adversarial sequences showing the tightness of elliptical potential lemmas on the lower bound side.

LGMay 7, 2018

Implementation of Stochastic Quasi-Newton's Method in PyTorch

Yingkai Li, Huidong Liu

In this paper, we implement the Stochastic Damped LBFGS (SdLBFGS) for stochastic non-convex optimization. We make two important modifications to the original SdLBFGS algorithm. First, by initializing the Hessian at each step using an identity matrix, the algorithm converges better than original algorithm. Second, by performing direction normalization we could gain stable optimization procedure without line search. Experiments on minimizing a 2D non-convex function shows that our improved algorithm converges better than original algorithm, and experiments on the CIFAR10 and MNIST datasets show that our improved algorithm works stably and gives comparable or even better testing accuracies than first order optimizers SGD, Adagrad, and second order optimizers LBFGS in PyTorch.