Simai He

h-index: 1
2 papers

2 Papers

ML, Dec 21, 2023
Best Arm Identification in Batched Multi-armed Bandit Problems

Shengyu Cao, Simai He, Ruoqing Jiang et al.

Recently, the multi-armed bandit problem has arisen in many real-life scenarios where arms must be sampled in batches because the agent can wait only a limited time for feedback. Such applications include biological experimentation and online marketing. The problem is further complicated when the number of arms is large and the number of batches is small. We consider pure exploration in a batched multi-armed bandit problem. We introduce a general linear programming framework that can incorporate objectives of different theoretical settings in best arm identification. The linear program leads to a two-stage algorithm that can achieve good theoretical properties. We demonstrate by numerical studies that the algorithm also performs well compared to certain UCB-type or Thompson sampling methods.

OC, Jan 2, 2007
Semidefinite Relaxation Bounds for Indefinite Homogeneous Quadratic Optimization

Simai He, Zhi-Quan Luo, Jiawang Nie et al.

In this paper we study the relationship between the optimal value of a homogeneous quadratic optimization problem and that of its Semidefinite Programming (SDP) relaxation. We consider two quadratic optimization models: (1) $\min \{x^* C x \mid x^* A_k x \ge 1, x\in\mathbb{F}^n, k=0,1,...,m\}$; and (2) $\max \{x^* C x \mid x^* A_k x \le 1, x\in\mathbb{F}^n, k=0,1,...,m\}$. If one of the $A_k$'s is indefinite while the others and $C$ are positive semidefinite, we prove that the ratio between the optimal value of (1) and that of its SDP relaxation is upper bounded by $O(m^2)$ when $\mathbb{F}$ is the real line $\mathbb{R}$, and by $O(m)$ when $\mathbb{F}$ is the complex plane $\mathbb{C}$. This result is an extension of the recent work of Luo et al. [LSTZ]. For (2), we show that the same ratio is bounded from below by $O(1/\log m)$ for both the real and complex cases, whenever all but one of the $A_k$'s are positive semidefinite while $C$ can be indefinite. This result improves the so-called approximate S-Lemma of Ben-Tal et al. [BNR02]. We also consider (2) with multiple indefinite quadratic constraints and derive a general bound in terms of the problem data and the SDP solution. Throughout the paper, we present examples showing that all of our results are essentially tight.