1.1ITApr 20
The Noisy Quantitative Group Testing ProblemTenghao Li, Neha Sangwan, Xiaxin Li et al.
In this paper, we study the problem of quantitative group testing (QGT) and analyze the performance of three models: the noiseless model, the additive Gaussian noise model, and the noisy Z-channel model. For each model, we analyze two algorithmic approaches: a linear estimator based on correlation scores, and a least squares estimator (LSE). We derive upper bounds on the number of tests required for exact recovery with vanishing error probability, and complement these results with information-theoretic lower bounds. In the additive Gaussian noise setting, our lower and upper bounds match in order.
MLFeb 21, 2025
Exact Recovery of Sparse Binary Vectors from Generalized Linear MeasurementsArya Mazumdar, Neha Sangwan
We consider the problem of exact recovery of a $k$-sparse binary vector from generalized linear measurements (such as logistic regression). We analyze the linear estimation algorithm (Plan, Vershynin, Yudovina, 2017), and also show information theoretic lower bounds on the number of required measurements. As a consequence of our results, for noisy one bit quantized linear measurements ($\mathsf{1bCSbinary}$), we obtain a sample complexity of $O((k+σ^2)\log{n})$, where $σ^2$ is the noise variance. This is shown to be optimal due to the information theoretic lower bound. We also obtain tight sample complexity characterization for logistic regression. Since $\mathsf{1bCSbinary}$ is a strictly harder problem than noisy linear measurements ($\mathsf{SparseLinearReg}$) because of added quantization, the same sample complexity is achievable for $\mathsf{SparseLinearReg}$. While this sample complexity can be obtained via the popular lasso algorithm, linear estimation is computationally more efficient. Our lower bound holds for any set of measurements for $\mathsf{SparseLinearReg}$, (similar bound was known for Gaussian measurement matrices) and is closely matched by the maximum-likelihood upper bound. For $\mathsf{SparseLinearReg}$, it was conjectured in Gamarnik and Zadik, 2017 that there is a statistical-computational gap and the number of measurements should be at least $(2k+σ^2)\log{n}$ for efficient algorithms to exist. It is worth noting that our results imply that there is no such statistical-computational gap for $\mathsf{1bCSbinary}$ and logistic regression.