Amin Gohari

h-index23

7papers

93citations

Novelty55%

AI Score50

Ranked #21,995 of 194,257 authors (top 11%)#47 in IT (top 6%)

7 Papers

19.8MLMar 4, 2022

Rate-Distortion Theoretic Generalization Bounds for Stochastic Learning Algorithms

Milad Sefidgaran, Amin Gohari, Gaël Richard et al.

Understanding generalization in modern machine learning settings has been one of the major challenges in statistical learning theory. In this context, recent years have witnessed the development of various generalization bounds suggesting different complexity notions such as the mutual information between the data sample and the algorithm output, compressibility of the hypothesis space, and the fractal dimension of the hypothesis space. While these bounds have illuminated the problem at hand from different angles, their suggested complexity notions might appear seemingly unrelated, thereby restricting their high-level impact. In this study, we prove novel generalization bounds through the lens of rate-distortion theory, and explicitly relate the concepts of mutual information, compressibility, and fractal dimensions in a single mathematical framework. Our approach consists of (i) defining a generalized notion of compressibility by using source coding concepts, and (ii) showing that the `compression error rate' can be linked to the generalization error both in expectation and with high probability. We show that in the `lossless compression' setting, we recover and improve existing mutual information-based bounds, whereas a `lossy compression' scheme allows us to link generalization to the rate-distortion dimension -- a particular notion of fractal dimension. Our results bring a more unified perspective on generalization and open up several future research directions.

6.6ITJun 21, 2022

f-divergences and their applications in lossy compression and bounding generalization error

Saeed Masiha, Amin Gohari, Mohammad Hossein Yassaee

In this paper, we provide three applications for $f$-divergences: (i) we introduce Sanov's upper bound on the tail probability of the sum of independent random variables based on super-modular $f$-divergence and show that our generalized Sanov's bound strictly improves over ordinary one, (ii) we consider the lossy compression problem which studies the set of achievable rates for a given distortion and code length. We extend the rate-distortion function using mutual $f$-information and provide new and strictly better bounds on achievable rates in the finite blocklength regime using super-modular $f$-divergences, and (iii) we provide a connection between the generalization error of algorithms with bounded input/output mutual $f$-information and a generalized rate-distortion problem. This connection allows us to bound the generalization error of learning algorithms using lower bounds on the $f$-rate-distortion function. Our bound is based on a new lower bound on the rate-distortion function that (for some examples) strictly improves over previously best-known bounds.

7.5ITApr 12

Dependence Balance and Capacity Bounds for Multiterminal Communication and Wiretap Channels

Amin Gohari, Gerhard Kramer

An information measure based on fractional partitions of a set is used to derive a general dependence balance inequality for communication. This inequality is used to obtain new upper bounds on reliable and secret rates for multiterminal channels. For example, we obtain a new upper bound on the rate of shared randomness generated among terminals, a counterpart of the cut-set bound for reliable communication. The bounds for reliable communication use the concept of auxiliary receivers, and we show that they are optimized by Gaussian distributions for Gaussian channels. The bounds are applied to multiaccess channels with generalized feedback and relay channels, and improve the cut-set bound for scalar Gaussian channels. The improvement for Gaussian relay channels complements results obtained with other methods.

6.6ITMar 31

The capacity region of classes of product broadcast channels

Yanlin Geng, Amin Gohari, Chandra Nair et al.

We establish a new outer bound for the capacity region of product broadcast channels. This outer bound matches Marton's inner bound for a variety of classes of product broadcast channels whose capacity regions were previously unknown. These classes include product of reversely semi-deterministic and product of reversely more-capable channels. A significant consequence of this new outer bound is that it establishes, via an example, that the previously best known outer-bound is strictly suboptimal for the general broadcast channel. Our example is comprised of a product broadcast channel with two semi-deterministic components in reverse orientation.

8.0ITMar 26

List Estimation

Nikola Zlatanov, Amin Gohari, Farzad Shahrivari et al.

Classical estimation outputs a single point estimate of an unknown $d$-dimensional vector from an observation. In this paper, we study \emph{$k$-list estimation}, in which a single observation is used to produce a list of $k$ candidate estimates and performance is measured by the expected squared distance from the true vector to the closest candidate. We compare this centralized setting with a symmetric decentralized MMSE benchmark in which $k$ agents observe conditionally i.i.d.\ measurements and each agent outputs its own MMSE estimate. On the centralized side, we show that optimal $k$-list estimation is equivalent to fixed-rate $k$-point vector quantization of the posterior distribution and, under standard regularity conditions, admits an exact high-rate asymptotic expansion with explicit constants and decay rate $k^{-2/d}$. On the decentralized side, we derive lower bounds in terms of the small-ball behavior of the single-agent MMSE error; in particular, when the conditional error density is bounded near the origin, the benchmark distortion cannot decay faster than order $k^{-2/d}$. We further show that if the error density vanishes at the origin, then the decentralized benchmark is provably unable to match the centralized $k^{-2/d}$ exponent, whereas the centralized estimator retains that scaling. Gaussian specializations yield explicit formulas and numerical experiments corroborate the predicted asymptotic behavior. Overall, the results show that, in the scaling with $k$, one observation combined with $k$ carefully chosen candidates can be asymptotically as effective as -- and in some regimes strictly better than -- this MMSE-based decentralized benchmark with $k$ independent observations.

13.4LGNov 5, 2024Code

Conditional Vendi Score: An Information-Theoretic Approach to Diversity Evaluation of Prompt-based Generative Models

Mohammad Jalali, Azim Ospanov, Amin Gohari et al.

Text-conditioned generation models are commonly evaluated based on the quality of the generated data and its alignment with the input text prompt. On the other hand, several applications of prompt-based generative models require sufficient diversity in the generated data to ensure the models' capability of generating image and video samples possessing a variety of features. However, most existing diversity metrics are designed for unconditional generative models, and thus cannot distinguish the diversity arising from variations in text prompts and that contributed by the generative model itself. In this work, our goal is to quantify the prompt-induced and model-induced diversity in samples generated by prompt-based models. We propose an information-theoretic approach for internal diversity quantification, where we decompose the kernel-based entropy $H(X)$ of the generated data $X$ into the sum of the conditional entropy $H(X|T)$, given text variable $T$, and the mutual information $I(X; T)$ between the text and data variables. We introduce the \emph{Conditional-Vendi} score based on $H(X|T)$ to quantify the internal diversity of the model and the \emph{Information-Vendi} score based on $I(X; T)$ to measure the statistical relevance between the generated data and text prompts. We provide theoretical results to statistically interpret these scores and relate them to the unconditional Vendi score. We conduct several numerical experiments to show the correlation between the Conditional-Vendi score and the internal diversity of text-conditioned generative models. The codebase is available at \href{https://github.com/mjalali/conditional-vendi}{https://github.com/mjalali/conditional-vendi}.

6.4LGFeb 28, 2024Code

On the Inductive Biases of Demographic Parity-based Fair Learning Algorithms

Haoyu Lei, Amin Gohari, Farzan Farnia

Fair supervised learning algorithms assigning labels with little dependence on a sensitive attribute have attracted great attention in the machine learning community. While the demographic parity (DP) notion has been frequently used to measure a model's fairness in training fair classifiers, several studies in the literature suggest potential impacts of enforcing DP in fair learning algorithms. In this work, we analytically study the effect of standard DP-based regularization methods on the conditional distribution of the predicted label given the sensitive attribute. Our analysis shows that an imbalanced training dataset with a non-uniform distribution of the sensitive attribute could lead to a classification rule biased toward the sensitive attribute outcome holding the majority of training data. To control such inductive biases in DP-based fair learning, we propose a sensitive attribute-based distributionally robust optimization (SA-DRO) method improving robustness against the marginal distribution of the sensitive attribute. Finally, we present several numerical results on the application of DP-based learning methods to standard centralized and distributed learning problems. The empirical findings support our theoretical results on the inductive biases in DP-based fair learning algorithms and the debiasing effects of the proposed SA-DRO method.