MLLGNov 16, 2021

Sequential Community Mode Estimation

arXiv:2111.08535v1
Originality Incremental advance
AI Analysis

This addresses a sequential decision-making problem for community detection in populations, with applications in social network analysis or surveys, but it is incremental as it builds on existing sampling and estimation frameworks.

The paper tackles the problem of identifying the largest community in a partitioned population via sequential random sampling from multiple boxes, aiming to minimize mis-identification probability under a fixed budget. It proposes novel algorithms with optimal error decay rates in many cases, validated by simulations on real-world datasets.

We consider a population, partitioned into a set of communities, and study the problem of identifying the largest community within the population via sequential, random sampling of individuals. There are multiple sampling domains, referred to as \emph{boxes}, which also partition the population. Each box may consist of individuals of different communities, and each community may in turn be spread across multiple boxes. The learning agent can, at any time, sample (with replacement) a random individual from any chosen box; when this is done, the agent learns the community the sampled individual belongs to, and also whether or not this individual has been sampled before. The goal of the agent is to minimize the probability of mis-identifying the largest community in a \emph{fixed budget} setting, by optimizing both the sampling strategy as well as the decision rule. We propose and analyse novel algorithms for this problem, and also establish information theoretic lower bounds on the probability of error under any algorithm. In several cases of interest, the exponential decay rates of the probability of error under our algorithms are shown to be optimal up to constant factors. The proposed algorithms are further validated via simulations on real-world datasets.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes