Gaussian Approximation of Quantization Error for Estimation from Compressed Data
This work analyzes estimation from compressed data, providing a theoretical framework that links lossy compression to additive-noise models for researchers in signal processing and machine learning.
The paper tackles the problem of connecting lossy compression to additive Gaussian noise for high-dimensional signals, showing that the Wasserstein distance between compressed and noisy versions grows sub-linearly with dimension. This connection enables deriving new results for inference under compression, such as minimax estimation and sparse regression.
We consider the distributional connection between the lossy compressed representation of a high-dimensional signal $X$ using a random spherical code and the observation of $X$ under additive white Gaussian noise (AWGN). We show that the Wasserstein distance between a bitrate-$R$ compressed version of $X$ and its observation over an AWGN channel with signal-to-noise ratio $2^{2R}-1$ is sub-linear in the problem dimension. We use this fact to connect the risk of an estimator based on an AWGN-corrupted version of $X$ to the risk attained by the same estimator when it is fed the bitrate-$R$ quantized version of $X$. We demonstrate the usefulness of this connection by deriving various novel results for inference problems under compression constraints, including minimax estimation, sparse regression, compressed sensing, and the universality of linear estimation in remote source coding.
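As an illustration of the bitrate-to-SNR correspondence, the following minimal sketch computes the signal-to-noise ratio $2^{2R}-1$ and checks one standard sanity property: for a Gaussian signal, the minimum mean squared error under AWGN at that SNR coincides with the Gaussian distortion-rate function $\sigma^2 2^{-2R}$. It also includes a toy random spherical encoder. The function names, the unit-norm codebook, and the maximum-inner-product encoding rule are assumptions made for illustration and are not taken from the paper; the toy encoder is only feasible in very low dimension because the codebook has $2^{nR}$ entries.

```python
import numpy as np


def awgn_snr_for_bitrate(rate_bits):
    """SNR of the AWGN channel paired with a bitrate-R code: 2^(2R) - 1."""
    return 2.0 ** (2.0 * rate_bits) - 1.0


def gaussian_distortion_rate(rate_bits, signal_var=1.0):
    """Distortion-rate function of an i.i.d. Gaussian source (squared error)."""
    return signal_var * 2.0 ** (-2.0 * rate_bits)


def gaussian_mmse_under_awgn(snr, signal_var=1.0):
    """MMSE of a Gaussian signal of variance signal_var observed at the given SNR."""
    return signal_var / (1.0 + snr)


def random_spherical_encode(x, rate_bits, rng):
    """Toy random spherical code (illustrative assumption, not the paper's exact scheme).

    Draws 2^(n*R) codewords uniformly on the unit sphere and returns the index
    and codeword with the largest inner product with x.
    """
    n = x.shape[0]
    num_codewords = int(round(2.0 ** (n * rate_bits)))
    # Normalized Gaussian vectors are uniformly distributed on the sphere.
    codebook = rng.standard_normal((num_codewords, n))
    codebook /= np.linalg.norm(codebook, axis=1, keepdims=True)
    best = int(np.argmax(codebook @ x))
    return best, codebook[best]


if __name__ == "__main__":
    rate = 1.0
    snr = awgn_snr_for_bitrate(rate)
    print("SNR for R = 1 bit:", snr)
    print("Gaussian MMSE at that SNR:", gaussian_mmse_under_awgn(snr))
    print("Gaussian distortion-rate:", gaussian_distortion_rate(rate))

    rng = np.random.default_rng(0)
    x = rng.standard_normal(8)
    index, codeword = random_spherical_encode(x, rate, rng)
    print("chosen codeword index:", index)
```

The agreement between the two Gaussian quantities is a consistency check on the choice of SNR, not a demonstration of the Wasserstein bound itself, which concerns the joint behavior in high dimension.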