Adrienne Tuynman

h-index: 1

Papers: 4

Citations: 3

Novelty: 52%

AI Score: 38

Ranked #86,355 of 194,257 authors (top 44%); #19,155 in LG (top 48%)

4 Papers

1.2 · ST · Jun 29

Optimal Posterior E-values with Non-Convex Parameter Sets with Applications to Voting Systems

Adrienne Tuynman, Timothée Mathieu

We are interested in conducting political polls sequentially, so that one can stop acquiring data as soon as possible while still safely obtaining statistically significant results. Building on e-values, which have recently become a useful tool for constructing sequential testing methods, we develop a theory of posterior optimal e-values. We use voting as a convenient example on which to illustrate our method. First, we design statistical tests for the Condorcet and Borda voting systems, and also for the Schulze voting system, which we are the first to tackle statistically. Then, we study the construction of optimal sequential e-values in the deceptively simple setting of multivariate Bernoulli data, with general composite null and alternative hypothesis sets $\mathcal{H}_0$ and $\mathcal{H}_1$. We give a way to compute these e-values using an efficient Frank-Wolfe algorithm, which yields a fairly general way to compute Reverse Information Projections, even when $\mathcal{H}_0$ corresponds to a non-convex parameter set. Finally, we illustrate the efficiency of our method in terms of both power and sample size. We compare with the state of the art in both simulated and real data experiments, with an application to French 2022 presidential election data.
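For intuition about how an e-value lets a poll stop early, here is a minimal sketch. It is not the posterior optimal construction described in the abstract: it assumes a single Bernoulli parameter, a simple null `p0`, and a fixed point alternative `p1`, and it uses the plain likelihood-ratio e-process. The function name and all parameters are hypothetical. The test rejects as soon as the running e-value reaches `1 / alpha`, which keeps the type I error below `alpha` under the null by Ville's inequality.

```python
import math


def sequential_bernoulli_test(samples, p0=0.5, p1=0.6, alpha=0.05):
    """Likelihood-ratio e-process for H0: p = p0 against the point alternative p = p1.

    Illustrative assumption: simple null and simple alternative, 0/1 observations.
    Returns (rejected, n_used, e_value).
    """
    log_e = 0.0                          # log of the running e-value, starts at log(1)
    threshold = math.log(1.0 / alpha)    # reject once e-value >= 1 / alpha
    n = 0
    for n, x in enumerate(samples, start=1):
        # Multiply the e-value by the one-step likelihood ratio of the observation.
        if x:
            log_e += math.log(p1 / p0)
        else:
            log_e += math.log((1.0 - p1) / (1.0 - p0))
        if log_e >= threshold:
            return True, n, math.exp(log_e)   # stop early: enough evidence against H0
    return False, n, math.exp(log_e)          # data exhausted without rejecting
```

Calling `sequential_bernoulli_test(votes)` on a stream of 0/1 responses returns whether the null was rejected and how many responses were needed. Handling composite or non-convex nulls, as the abstract describes, requires replacing the fixed ratio with a more careful construction.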

9.4 · LG · Feb 3, 2025

The Batch Complexity of Bandit Pure Exploration

Adrienne Tuynman, Rémy Degenne

In a fixed-confidence pure exploration problem in stochastic multi-armed bandits, an algorithm iteratively samples arms and should stop as early as possible and return the correct answer to a query about the arms' distributions. We are interested in batched methods, which change their sampling behaviour only a few times, between batches of observations. We give an instance-dependent lower bound on the number of batches used by any sample-efficient algorithm for any pure exploration task. We then give a general batched algorithm and prove upper bounds on its expected sample complexity and batch complexity. We illustrate both lower and upper bounds on best-arm identification and thresholding bandits.
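To make the notion of a batched method concrete, here is a minimal sketch of batched successive elimination for best-arm identification. It is a textbook-style illustration, not the algorithm from the paper. It assumes Bernoulli rewards, batch sizes that double each round, and a Hoeffding-style confidence radius. The function name, parameters, and simulated arm means are all hypothetical. The sampling rule changes only between batches, when arms are eliminated.

```python
import math
import random


def batched_elimination(means, delta=0.05, first_batch=16, max_batches=20, seed=0):
    """Batched successive elimination on simulated Bernoulli arms.

    Illustrative assumptions: rewards in {0, 1}, doubling batch sizes,
    Hoeffding radius with a union bound over arms and batches.
    Returns (chosen_arm, batches_used, total_pulls).
    """
    rng = random.Random(seed)
    k = len(means)
    active = list(range(k))   # arms still in contention
    sums = [0.0] * k          # cumulative reward per arm
    pulls = 0                 # pulls per active arm so far
    total = 0                 # total pulls across all arms
    batch = first_batch
    for r in range(1, max_batches + 1):
        # Within a batch the sampling rule is fixed: pull every active arm `batch` times.
        for a in active:
            sums[a] += sum(rng.random() < means[a] for _ in range(batch))
        pulls += batch
        total += batch * len(active)
        # Between batches: update the confidence radius and eliminate arms.
        radius = math.sqrt(math.log(4 * k * r * r / delta) / (2 * pulls))
        best = max(sums[a] / pulls for a in active)
        active = [a for a in active if sums[a] / pulls + 2 * radius >= best]
        if len(active) == 1:
            return active[0], r, total
        batch *= 2
    # Batch budget exhausted: fall back to the empirical best among remaining arms.
    return max(active, key=lambda a: sums[a]), max_batches, total
```

For example, `batched_elimination([0.9, 0.1])` returns the chosen arm along with the number of batches and pulls used. The number of batches is the quantity the abstract's lower and upper bounds concern.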

4.1 · LG · Oct 15, 2025

Towards Blackwell Optimality: Bellman Optimality Is All You Can Get

Victor Boone, Adrienne Tuynman

Although average gain optimality is a commonly adopted performance measure in Markov Decision Processes (MDPs), it is often too asymptotic. Further incorporating measures of immediate losses leads to the hierarchy of bias optimalities, all the way up to Blackwell optimality. In this paper, we investigate the problem of identifying policies of such optimality orders. To that end, for each order, we construct a learning algorithm with vanishing probability of error. Furthermore, we characterize the class of MDPs for which identification algorithms can stop in finite time. That class corresponds to the MDPs with a unique Bellman optimal policy, and does not depend on the optimality order considered. Lastly, we provide a tractable stopping rule that, when coupled with our learning algorithm, triggers in finite time whenever it is possible to do so.

3.3 · LG · Feb 2, 2022

Transfer in Reinforcement Learning via Regret Bounds for Learning Agents

Adrienne Tuynman, Ronald Ortner

We present an approach for quantifying the usefulness of transfer in reinforcement learning via regret bounds for a multi-agent setting. Considering a number $\aleph$ of agents operating in the same Markov decision process, though possibly with different reward functions, we consider the regret each agent suffers with respect to an optimal policy maximizing her average reward. We show that when the agents share their observations, the total regret of all agents is smaller by a factor of $\sqrt{\aleph}$ compared to the case when each agent has to rely on the information collected by herself. This result demonstrates how considering the regret in multi-agent settings can provide theoretical bounds on the benefit of sharing observations in transfer learning.
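The $\sqrt{\aleph}$ factor can be illustrated with a short calculation. As an assumption for illustration only (the abstract does not state the form of the bound), suppose a single agent learning alone for $T$ steps has regret at most $C\sqrt{T}$, where $C$ collects the MDP-dependent terms. Without sharing, the total regret of the $\aleph$ agents is bounded by

$$R_{\text{alone}} \le \aleph \, C \sqrt{T}.$$

A total that is smaller by a factor of $\sqrt{\aleph}$, as the abstract states for the sharing case, then reads

$$R_{\text{shared}} \le \frac{\aleph \, C \sqrt{T}}{\sqrt{\aleph}} = C \sqrt{\aleph T},$$

which is the regret one would expect from a single learner that has access to all $\aleph T$ pooled observations.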