ML | DS | LG | Feb 2, 2024

Query-Efficient Correlation Clustering with Noisy Oracle

arXiv:2402.01400v2 | 8 citations | h-index: 6 | NIPS
AI Analysis

This paper addresses the challenge of efficient clustering in domains where similarity computations are costly and noisy. It represents an incremental advance in algorithm design for combinatorial bandits.

The paper tackles the problem of clustering elements with as few queries as possible to a noisy similarity oracle. It introduces algorithms based on Pure Exploration in Combinatorial Multi-Armed Bandits (PE-CMAB) for the fixed-confidence and fixed-budget settings. These algorithms run in polynomial time even though the underlying offline optimization problem is NP-hard.

We study a general clustering setting in which we have $n$ elements to be clustered, and we aim to perform as few queries as possible to an oracle that returns a noisy sample of the weighted similarity between two elements. Our setting encompasses many application domains in which the similarity function is costly to compute and inherently noisy. We introduce two novel formulations of online learning problems rooted in the paradigm of Pure Exploration in Combinatorial Multi-Armed Bandits (PE-CMAB): fixed confidence and fixed budget settings. For both settings, we design algorithms that combine a sampling strategy with a classic approximation algorithm for correlation clustering and study their theoretical guarantees. Our results are the first examples of polynomial-time algorithms that work for the case of PE-CMAB in which the underlying offline optimization problem is NP-hard.
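
The pipeline described in the abstract (sample the noisy oracle, then run a classic correlation clustering approximation on the estimates) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the sampling strategy is plain uniform sampling rather than an adaptive one, the clustering step is a KwikCluster-style pivot routine, and the names `make_noisy_oracle`, `estimate_similarities`, `pivot_clustering`, and the `threshold` parameter are hypothetical.

```python
import random
from itertools import combinations


def make_noisy_oracle(labels, rng, noise=0.2):
    """Hypothetical oracle: true similarity is 1 within a ground-truth
    cluster and 0 across clusters, observed with Gaussian noise."""
    def oracle(i, j):
        true_sim = 1.0 if labels[i] == labels[j] else 0.0
        return true_sim + rng.gauss(0.0, noise)
    return oracle


def estimate_similarities(n, oracle, samples_per_pair):
    """Uniform sampling (an illustrative simplification): query every
    pair the same number of times and average the noisy samples."""
    est = {}
    for i, j in combinations(range(n), 2):
        total = sum(oracle(i, j) for _ in range(samples_per_pair))
        est[(i, j)] = total / samples_per_pair
    return est


def pivot_clustering(n, est, rng, threshold=0.5):
    """KwikCluster-style pivoting on the estimated similarities:
    pick a random pivot, group it with every remaining element whose
    estimated similarity to the pivot exceeds the threshold, repeat."""
    remaining = list(range(n))
    clusters = []
    while remaining:
        pivot = rng.choice(remaining)
        cluster = [pivot]
        for v in remaining:
            if v != pivot and est[(min(pivot, v), max(pivot, v))] > threshold:
                cluster.append(v)
        clusters.append(sorted(cluster))
        members = set(cluster)
        remaining = [v for v in remaining if v not in members]
    return clusters


if __name__ == "__main__":
    rng = random.Random(0)
    labels = [0, 0, 0, 1, 1, 2]  # hypothetical ground-truth clustering
    oracle = make_noisy_oracle(labels, rng)
    est = estimate_similarities(len(labels), oracle, samples_per_pair=30)
    print(pivot_clustering(len(labels), est, rng))
```

The total number of oracle queries in this sketch is `samples_per_pair` times the number of pairs. A query-efficient method would instead spend more samples on pairs whose similarity is hard to resolve and fewer on pairs that are clearly similar or dissimilar, which is what the fixed-confidence and fixed-budget formulations are meant to capture.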

