MLLGJul 31, 2021

Pure Exploration and Regret Minimization in Matching Bandits

arXiv:2108.00230v15 citations
Originality Incremental advance
AI Analysis

This addresses a combinatorial optimization problem for machine learning and algorithmic decision-making, with incremental improvements in efficiency.

The paper tackles the problem of finding an optimal matching in weighted graphs under a semi-bandit setting, showing that leveraging a rank-1 assumption on the adjacency matrix reduces sample complexity and regret to near-linear dependency in the number of vertices.

Finding an optimal matching in a weighted graph is a standard combinatorial problem. We consider its semi-bandit version where either a pair or a full matching is sampled sequentially. We prove that it is possible to leverage a rank-1 assumption on the adjacency matrix to reduce the sample complexity and the regret of off-the-shelf algorithms up to reaching a linear dependency in the number of vertices (up to poly log terms).

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes