QUANT-PHJul 3, 2023
Quantum Neural Estimation of EntropiesZiv Goldfeld, Dhrumil Patel, Sreejith Sreekumar et al.
Entropy measures quantify the amount of information and correlation present in a quantum system. In practice, when the quantum state is unknown and only copies thereof are available, one must resort to the estimation of such entropy measures. Here we propose a variational quantum algorithm for estimating the von Neumann and Rényi entropies, as well as the measured relative entropy and measured Rényi relative entropy. Our approach first parameterizes a variational formula for the measure of interest by a quantum circuit and a classical neural network, and then optimizes the resulting objective over parameter space. Numerical simulations of our quantum algorithm are provided, using a noiseless quantum simulator. The algorithm provides accurate estimates of the various entropy measures for the examples tested, which renders it as a promising approach for usage in downstream tasks.
32.6QUANT-PHApr 28
Distributed Quantum Hypothesis Testing under Zero-rate Communication ConstraintsSreejith Sreekumar, Christoph Hirche, Hao-Chung Cheng et al.
The trade-offs between error probabilities in quantum hypothesis testing are by now well-understood in the centralized setting, but much less is known for distributed settings. Here, we study a distributed binary hypothesis testing problem to infer a bipartite quantum state shared between two remote parties, where one of these parties communicates to the tester at (asymptotic) zero-rate, while the other party communicates to the tester at zero-rate or higher. As our main contribution, we derive an efficiently computable single-letter formula for the Stein's exponent of this problem, when the state under the alternative is the product of their marginals. For proving the converse direction of our result, we utilize a novel technique based on reverse hypercontractivity of a quantum markov semigroup combined with the pinching method. For the general case with vanishing type I error probability, we show that the Stein's exponent when (at least) one of the parties communicates classically at zero-rate is given by a multi-letter expression involving regularized measured relative entropy maximized over a sub-class of binary outcome separable measurements. When the state under the alternative commutes with the product of marginals under the null and has a larger support, we show that the exponent is characterized as a max-min optimization of regularized measured relative entropy over a class of local binary outcome projective measurements. While this expression becomes single-letter for the fully classical case, we further prove that this already does not happen in the same way for classical-quantum states in general. The converse proof of the max-min characterization relies on an extension of the classical blowing-up lemma to bipartite quantum states whose marginals commute, which could be of independent interest.
15.6ITMar 25
One-shot Multiple Access Channel SimulationAditya Nema, Sreejith Sreekumar, Mario Berta
We consider the problem of shared randomness-assisted multiple access channel (MAC) simulation for product inputs and characterize the one-shot communication cost region via almost-matching inner and outer bounds in terms of the smooth max-information of the channel, featuring auxiliary random variables of bounded size. The achievability relies on a rejection-sampling algorithm to simulate an auxiliary channel between each sender and the decoder, and producing the final output based on the output of these intermediate channels. The converse follows via information-spectrum based arguments. To bound the cardinality of the auxiliary random variables, we employ the perturbation method from [Anantharam et al., IEEE Trans. Inf. Theory (2019)] in the one-shot setting. For the asymptotic setting and vanishing errors, our result expands to a tight single-letter rate characterization and consequently extends a special case of the simulation results of [Kurri et al., IEEE Trans. Inf. Theory (2022)] for fixed, independent and identically distributed (iid) product inputs to universal simulation for any product inputs. We broaden our discussion into the quantum realm by studying feedback simulation of quantum-to-classical (QC) MACs with product measurements [Atif et al., IEEE Trans. Inf. Theory (2022)]. For fixed product inputs and with shared randomness assistance, we give a quasi tight one-shot communication cost region with corresponding single-letter asymptotic iid expansion.
ITFeb 20
Quantum Maximum Likelihood Prediction via Hilbert Space EmbeddingsSreejith Sreekumar, Nir Weinberger
Recent works have proposed various explanations for the ability of modern large language models (LLMs) to perform in-context prediction. We propose an alternative conceptual viewpoint from an information-geometric and statistical perspective. Motivated by Bach[2023], we model training as learning an embedding of probability distributions into the space of quantum density operators, and in-context learning as maximum-likelihood prediction over a specified class of quantum models. We provide an interpretation of this predictor in terms of quantum reverse information projection and quantum Pythagorean theorem when the class of quantum models is sufficiently expressive. We further derive non-asymptotic performance guarantees in terms of convergence rates and concentration inequalities, both in trace norm and quantum relative entropy. Our approach provides a unified framework to handle both classical and quantum LLMs.
QUANT-PHNov 24, 2025
Performance Guarantees for Quantum Neural Estimation of EntropiesSreejith Sreekumar, Ziv Goldfeld, Mark M. Wilde
Estimating quantum entropies and divergences is an important problem in quantum physics, information theory, and machine learning. Quantum neural estimators (QNEs), which utilize a hybrid classical-quantum architecture, have recently emerged as an appealing computational framework for estimating these measures. Such estimators combine classical neural networks with parametrized quantum circuits, and their deployment typically entails tedious tuning of hyperparameters controlling the sample size, network architecture, and circuit topology. This work initiates the study of formal guarantees for QNEs of measured (Rényi) relative entropies in the form of non-asymptotic error risk bounds. We further establish exponential tail bounds showing that the error is sub-Gaussian, and thus sharply concentrates about the ground truth value. For an appropriate sub-class of density operator pairs on a space of dimension $d$ with bounded Thompson metric, our theory establishes a copy complexity of $O(|Θ(\mathcal{U})|d/ε^2)$ for QNE with a quantum circuit parameter set $Θ(\mathcal{U})$, which has minimax optimal dependence on the accuracy $ε$. Additionally, if the density operator pairs are permutation invariant, we improve the dimension dependence above to $O(|Θ(\mathcal{U})|\mathrm{polylog}(d)/ε^2)$. Our theory aims to facilitate principled implementation of QNEs for measured relative entropies and guide hyperparameter tuning in practice.
STOct 7, 2021
Neural Estimation of Statistical DivergencesSreejith Sreekumar, Ziv Goldfeld
Statistical divergences (SDs), which quantify the dissimilarity between probability distributions, are a basic constituent of statistical inference and machine learning. A modern method for estimating those divergences relies on parametrizing an empirical variational form by a neural network (NN) and optimizing over parameter space. Such neural estimators are abundantly used in practice, but corresponding performance guarantees are partial and call for further exploration. We establish non-asymptotic absolute error bounds for a neural estimator realized by a shallow NN, focusing on four popular $\mathsf{f}$-divergences -- Kullback-Leibler, chi-squared, squared Hellinger, and total variation. Our analysis relies on non-asymptotic function approximation theorems and tools from empirical process theory to bound the two sources of error involved: function approximation and empirical estimation. The bounds characterize the effective error in terms of NN size and the number of samples, and reveal scaling rates that ensure consistency. For compactly supported distributions, we further show that neural estimators of the first three divergences above with appropriate NN growth-rate are minimax rate-optimal, achieving the parametric convergence rate.
STMar 11, 2021
Non-Asymptotic Performance Guarantees for Neural Estimation of $\mathsf{f}$-DivergencesSreejith Sreekumar, Zhengxin Zhang, Ziv Goldfeld
Statistical distances (SDs), which quantify the dissimilarity between probability distributions, are central to machine learning and statistics. A modern method for estimating such distances from data relies on parametrizing a variational form by a neural network (NN) and optimizing it. These estimators are abundantly used in practice, but corresponding performance guarantees are partial and call for further exploration. In particular, there seems to be a fundamental tradeoff between the two sources of error involved: approximation and estimation. While the former needs the NN class to be rich and expressive, the latter relies on controlling complexity. This paper explores this tradeoff by means of non-asymptotic error bounds, focusing on three popular choices of SDs -- Kullback-Leibler divergence, chi-squared divergence, and squared Hellinger distance. Our analysis relies on non-asymptotic function approximation theorems and tools from empirical process theory. Numerical results validating the theory are also provided.
SPSep 28, 2020
Communicate to Learn at the EdgeDeniz Gunduz, David Burth Kurka, Mikolaj Jankowski et al.
Bringing the success of modern machine learning (ML) techniques to mobile devices can enable many new services and businesses, but also poses significant technical and research challenges. Two factors that are critical for the success of ML algorithms are massive amounts of data and processing power, both of which are plentiful, yet highly distributed at the network edge. Moreover, edge devices are connected through bandwidth- and power-limited wireless links that suffer from noise, time-variations, and interference. Information and coding theory have laid the foundations of reliable and efficient communications in the presence of channel imperfections, whose application in modern wireless networks have been a tremendous success. However, there is a clear disconnect between the current coding and communication schemes, and the ML algorithms deployed at the network edge. In this paper, we challenge the current approach that treats these problems separately, and argue for a joint communication and learning paradigm for both the training and inference stages of edge learning.