Manuj Mukherjee

h-index2

4papers

2citations

Novelty60%

AI Score43

Ranked #79,037 of 201,326 authors (top 39%)#312 in IT (top 36%)

4 Papers

32.9ITJun 1

Query-Limited Community Recovery in Stochastic Block Models

Sabyasachi Basu, Manuj Mukherjee, Lutz Oettershagen et al.

We study exact community recovery in the two-community stochastic block model on $n$ vertices under limited and noisy access to network data. The learner may query a noisy neighborhood oracle that reveals each true neighbor of a queried vertex independently with fixed probability and never returns non-neighbors, subject to a finite query budget. We consider both oracle-only access and a combined model where the learner also observes a single subsampled copy of the underlying graph. For oracle-only access, balanced uniform querying gives a sharp non-adaptive benchmark: when each vertex is queried the same integer number of times, the observations reduce to an SBM with attenuated edge probabilities and the Abbe-Bandeira-Hall exact-recovery threshold applies. We show that this benchmark is not adaptively optimal: a two-stage adaptive strategy succeeds with $n+o(n)$ queries in a regime where balanced uniform querying requires $m n$ queries for some $m>1$. With an additional subsampled graph, we prove a sublinear-query adaptivity gap: balanced data-independent uniform querying with a sublinear budget does not improve over the subsampled graph alone, whereas adaptive querying can target a small set of uncertain vertices and achieve exact recovery. Thus adaptive data acquisition can strictly improve the information-theoretic limits of exact recovery.

10.3ITMar 30

Perfect Secret Key Generation for a class of Hypergraphical Sources

Manuj Mukherjee, Sagnik Chatterjee, Alhad Sethi

Nitinawarat and Narayan proposed a perfect secret key generation scheme for the so-called \emph{pairwise independent network (PIN) model} by exploiting the combinatorial properties of the underlying graph, namely the spanning tree packing rate. This work considers a generalization of the PIN model where the underlying graph is replaced with a hypergraph, and makes progress towards designing similar perfect secret key generation schemes by exploiting the combinatorial properties of the hypergraph. Our contributions are two-fold. We first provide a capacity achieving scheme for a complete $t$-uniform hypergraph on $m$ vertices by leveraging a packing of the complete $t$-uniform hypergraphs by what we refer to as star hypergraphs, and designing a scheme that gives $\binom{m-2}{t-2}$ bits of perfect secret key per star graph. Our second contribution is a 2-bit perfect secret key generation scheme for 3-uniform star hypergraphs whose projections are cycles. This scheme is then extended to a perfect secret key generation scheme for generic 3-uniform hypergraphs by exploiting star graph packing of 3-uniform hypergraphs and Hamiltonian packings of graphs. The scheme is then shown to be capacity achieving for certain classes of hypergraphs.

LGMay 22, 2024

Generalization Bounds for Dependent Data using Online-to-Batch Conversion

Sagnik Chatterjee, Manuj Mukherjee, Alhad Sethi

In this work, we upper bound the generalization error of batch learning algorithms trained on samples drawn from a mixing stochastic process (i.e., a dependent data source) both in expectation and with high probability. Unlike previous results by Mohri et al. (2010) and Fu et al. (2023), our work does not require any stability assumptions on the batch learner, which allows us to derive upper bounds for any batch learning algorithm trained on dependent data. This is made possible due to our use of the Online-to-Batch ( OTB ) conversion framework, which allows us to shift the burden of stability from the batch learner to an artificially constructed online learner. We show that our bounds are equal to the bounds in the i.i.d. setting up to a term that depends on the decay rate of the underlying mixing stochastic process. Central to our analysis is a new notion of algorithmic stability for online learning algorithms based on Wasserstein distances of order one. Furthermore, we prove that the EWA algorithm, a textbook family of online learning algorithms, satisfies our new notion of stability. Following this, we instantiate our bounds using the EWA algorithm.

LGJan 25, 2021

Approximating Probability Distributions by ReLU Networks

Manuj Mukherjee, Aslan Tchamkerten, Mansoor Yousefi

How many neurons are needed to approximate a target probability distribution using a neural network with a given input distribution and approximation error? This paper examines this question for the case when the input distribution is uniform, and the target distribution belongs to the class of histogram distributions. We obtain a new upper bound on the number of required neurons, which is strictly better than previously existing upper bounds. The key ingredient in this improvement is an efficient construction of the neural nets representing piecewise linear functions. We also obtain a lower bound on the minimum number of neurons needed to approximate the histogram distributions.