ITNov 15, 2023
On the Computation of the Gaussian Rate-Distortion-Perception FunctionGiuseppe Serra, Photios A. Stavrou, Marios Kountouris
In this paper, we study the computation of the rate-distortion-perception function (RDPF) for a multivariate Gaussian source under mean squared error (MSE) distortion and, respectively, Kullback-Leibler divergence, geometric Jensen-Shannon divergence, squared Hellinger distance, and squared Wasserstein-2 distance perception metrics. To this end, we first characterize the analytical bounds of the scalar Gaussian RDPF for the aforementioned divergence functions, also providing the RDPF-achieving forward "test-channel" realization. Focusing on the multivariate case, we establish that, for tensorizable distortion and perception metrics, the optimal solution resides on the vector space spanned by the eigenvector of the source covariance matrix. Consequently, the multivariate optimization problem can be expressed as a function of the scalar Gaussian RDPFs of the source marginals, constrained by global distortion and perception levels. Leveraging this characterization, we design an alternating minimization scheme based on the block nonlinear Gauss-Seidel method, which optimally solves the problem while identifying the Gaussian RDPF-achieving realization. Furthermore, the associated algorithmic embodiment is provided, as well as the convergence and the rate of convergence characterization. Lastly, for the "perfect realism" regime, the analytical solution for the multivariate Gaussian RDPF is obtained. We corroborate our results with numerical simulations and draw connections to existing results.
ITNov 18, 2015
Information Nonanticipative Rate Distortion Function and Its ApplicationsPhotios A. Stavrou, Christos K. Kourtellaris, C. D. Charalambous
This paper investigates applications of nonanticipative Rate Distortion Function (RDF) in a) zero-delay Joint Source-Channel Coding (JSCC) design based on average and excess distortion probability, b) in bounding the Optimal Performance Theoretically Attainable (OPTA) by noncausal and causal codes, and computing the Rate Loss (RL) of zero-delay and causal codes with respect to noncausal codes. These applications are described using two running examples, the Binary Symmetric Markov Source with parameter p, (BSMS(p)) and the multidimensional partially observed Gaussian-Markov source. For the multidimensional Gaussian-Markov source with square error distortion, the solution of the nonanticipative RDF is derived, its operational meaning using JSCC design via a noisy coding theorem is shown by providing the optimal encoding-decoding scheme over a vector Gaussian channel, and the RL of causal and zero-delay codes with respect to noncausal codes is computed. For the BSMS(p) with Hamming distortion, the solution of the nonanticipative RDF is derived, the RL of causal codes with respect to noncausal codes is computed, and an uncoded noisy coding theorem based on excess distortion probability is shown. The information nonanticipative RDF is shown to be equivalent to the nonanticipatory epsilon-entropy, which corresponds to the classical RDF with an additional causality or nonanticipative condition imposed on the optimal reproduction conditional distribution.
ITJan 18, 2014
Nonanticipative Rate Distortion Function and Filtering Theory: A weak Convergence ApproachPhotios A. Stavrou, Charalambos D. Charalambous
In this paper the relation between nonanticipative rate distortion function (RDF) and Bayesian filtering theory is further investigated on general Polish spaces. The relation is established via an optimization on the space of conditional distributions of the so-called directed information subject to fidelity constraints. Existence of the optimal reproduction distribution of the nonanticipative RDF is shown using the topology of weak convergence of probability measures. Subsequently, we use the solution of the nonanticipative RDF to present the realization of a multidimensional partially observable source over a scalar Gaussian channel. We show that linear encoders are optimal, establishing joint source-channel coding in real-time.
ITApr 25, 2013
On the relation of nonanticipative rate distortion function and filtering theoryCharalambos D. Charalambous, Photios A. Stavrou
In this paper the relation between nonanticipative rate distortion function (RDF) and Bayesian filtering theory is investigated using the topology of weak convergence of probability measures on Polish spaces. The relation is established via an optimization on the space of conditional distributions of the so-called directed information subject to fidelity constraints. Existence of the optimal reproduction distribution of the nonanticipative RDF is shown, while the optimal nonanticipative reproduction conditional distribution for stationary processes is derived in closed form. The realization procedure of nonanticipative RDF which is equivalent to joint-source channel matching for symbol-by-symbol transmission is described, while an example is introduced to illustrate the concepts.
ITJan 28, 2013
Optimal Nonstationary Reproduction Distribution for Nonanticipative RDF on Abstract AlphabetsPhotios A. Stavrou, Charalambos D. Charalambous, Christos K. Kourtellaris
In this paper we introduce a definition for nonanticipative Rate Distortion Function (RDF) on abstract alphabets, and we invoke weak convergence of probability measures to show various of its properties, such as, existence of the optimal reproduction conditional distribution, compactness of the fidelity set, lower semicontinuity of the RDF functional, etc. Further, we derive the closed form expression of the optimal nonstationary reproduction distribution. This expression is computed recursively backward in time. Throughout the paper we point out an operational meaning of the nonanticipative RDF by recalling the coding theorem derive in \cite{tatikonda2000}, and we state relations to Gorbunov-Pinsker's nonanticipatory $ε-$entropy \cite{gorbunov-pinsker}.
ITAug 27, 2024
Alternating Minimization Schemes for Computing Rate-Distortion-Perception Functions with $f$-Divergence Perception ConstraintsGiuseppe Serra, Photios A. Stavrou, Marios Kountouris
We study the computation of the rate-distortion-perception function (RDPF) for discrete memoryless sources subject to a single-letter average distortion constraint and a perception constraint belonging to the family of $f$-divergences. In this setting, the RDPF forms a convex programming problem for which we characterize optimal parametric solutions. We employ the developed solutions in an alternating minimization scheme, namely Optimal Alternating Minimization (OAM), for which we provide convergence guarantees. Nevertheless, the OAM scheme does not lead to a direct implementation of a generalized Blahut-Arimoto (BA) type of algorithm due to implicit equations in the iteration's structure. To overcome this difficulty, we propose two alternative minimization approaches whose applicability depends on the smoothness of the used perception metric: a Newton-based Alternating Minimization (NAM) scheme, relying on Newton's root-finding method for the approximation of the optimal solution of the iteration, and a Relaxed Alternating Minimization (RAM) scheme, based on relaxing the OAM iterates. We show, by deriving necessary and sufficient conditions, that both schemes guarantee convergence to a globally optimal solution. We also provide sufficient conditions on the distortion and perception constraints, which guarantee that the proposed algorithms converge exponentially fast in the number of iteration steps. We corroborate our theoretical results with numerical simulations and establish connections with existing results.
LGDec 16, 2024
Information-Geometric Barycenters for Bayesian Federated LearningNour Jamoussi, Giuseppe Serra, Photios A. Stavrou et al.
Federated learning (FL) is a widely used and impactful distributed optimization framework that achieves consensus through averaging locally trained models. While effective, this approach may not align well with Bayesian inference, where the model space has the structure of a distribution space. Taking an information-geometric perspective, we reinterpret FL aggregation as the problem of finding the barycenter of local posteriors using a prespecified divergence metric, minimizing the average discrepancy across clients. This perspective provides a unifying framework that generalizes many existing methods and offers crisp insights into their theoretical underpinnings. We then propose BA-BFL, an algorithm that retains the convergence properties of Federated Averaging in non-convex settings. In non-independent and identically distributed scenarios, we conduct extensive comparisons with statistical aggregation techniques, showing that BA-BFL achieves performance comparable to state-of-the-art methods while offering a geometric interpretation of the aggregation phase. Additionally, we extend our analysis to Hybrid Bayesian Deep Learning, exploring the impact of Bayesian layers on uncertainty quantification and model calibration.
LGSep 12, 2025
Cost-Free Personalization via Information-Geometric Projection in Bayesian Federated LearningNour Jamoussi, Giuseppe Serra, Photios A. Stavrou et al.
Bayesian Federated Learning (BFL) combines uncertainty modeling with decentralized training, enabling the development of personalized and reliable models under data heterogeneity and privacy constraints. Existing approaches typically rely on Markov Chain Monte Carlo (MCMC) sampling or variational inference, often incorporating personalization mechanisms to better adapt to local data distributions. In this work, we propose an information-geometric projection framework for personalization in parametric BFL. By projecting the global model onto a neighborhood of the user's local model, our method enables a tunable trade-off between global generalization and local specialization. Under mild assumptions, we show that this projection step is equivalent to computing a barycenter on the statistical manifold, allowing us to derive closed-form solutions and achieve cost-free personalization. We apply the proposed approach to a variational learning setup using the Improved Variational Online Newton (IVON) optimizer and extend its application to general aggregation schemes in BFL. Empirical evaluations under heterogeneous data distributions confirm that our method effectively balances global and local performance with minimal computational overhead.
ITJul 23, 2025
Information Entropy-Based Scheduling for Communication-Efficient Decentralized LearningJaiprakash Nagar, Zheng Chen, Marios Kountouris et al.
This paper addresses decentralized stochastic gradient descent (D-SGD) over resource-constrained networks by introducing node-based and link-based scheduling strategies to enhance communication efficiency. In each iteration of the D-SGD algorithm, only a few disjoint subsets of nodes or links are randomly activated, subject to a given communication cost constraint. We propose a novel importance metric based on information entropy to determine node and link scheduling probabilities. We validate the effectiveness of our approach through extensive simulations, comparing it against state-of-the-art methods, including betweenness centrality (BC) for node scheduling and \textit{MATCHA} for link scheduling. The results show that our method consistently outperforms the BC-based method in the node scheduling case, achieving faster convergence with up to 60\% lower communication budgets. At higher communication budgets (above 60\%), our method maintains comparable or superior performance. In the link scheduling case, our method delivers results that are superior to or on par with those of \textit{MATCHA}.
ITSep 17, 2021
Generalized Talagrand Inequality for Sinkhorn Distance using Entropy Power InequalityShuchan Wang, Photios A. Stavrou, Mikael Skoglund
In this paper, we study the connection between entropic optimal transport and entropy power inequality (EPI). First, we prove an HWI-type inequality making use of the infinitesimal displacement convexity of optimal transport map. Second, we derive two Talagrand-type inequalities using the saturation of EPI that corresponds to a numerical term in our expression. We evaluate for a wide variety of distributions this term whereas for Gaussian and i.i.d. Cauchy distributions this term is found in explicit form. We show that our results extend previous results of Gaussian Talagrand inequality for Sinkhorn distance to the strongly log-concave case.
ITSep 30, 2018
Zero-Delay Rate Distortion via Filtering for Vector-Valued Gaussian SourcesPhotios A. Stavrou, Jan Ostergaard, Charalambos D. Charalambous
We deal with zero-delay source coding of a vector-valued Gauss-Markov source subject to a mean-squared error (MSE) fidelity criterion characterized by the operational zero-delay vector-valued Gaussian rate distortion function (RDF). We address this problem by considering the nonanticipative RDF (NRDF) which is a lower bound to the causal optimal performance theoretically attainable (OPTA) function and operational zero-delay RDF. We recall the realization that corresponds to the optimal "test-channel" of the Gaussian NRDF, when considering a vector Gauss-Markov source subject to a MSE distortion in the finite time horizon. Then, we introduce sufficient conditions to show existence of solution for this problem in the infinite time horizon. For the asymptotic regime, we use the asymptotic characterization of the Gaussian NRDF to provide a new equivalent realization scheme with feedback which is characterized by a resource allocation (reverse-waterfilling) problem across the dimension of the vector source. We leverage the new realization to derive a predictive coding scheme via lattice quantization with subtractive dither and joint memoryless entropy coding. This coding scheme offers an upper bound to the operational zero-delay vector-valued Gaussian RDF. When we use scalar quantization, then for "r" active dimensions of the vector Gauss-Markov source the gap between the obtained lower and theoretical upper bounds is less than or equal to 0.254r + 1 bits/vector. We further show that it is possible when we use vector quantization, and assume infinite dimensional Gauss-Markov sources to make the previous gap to be negligible, i.e., Gaussian NRDF approximates the operational zero-delay Gaussian RDF. We also extend our results to vector-valued Gaussian sources of any finite memory under mild conditions. Our theoretical framework is demonstrated with illustrative numerical experiments.