Gowtham R. Kurri

LG
h-index: 7
7 papers
54 citations
Novelty: 50%
AI Score: 47

7 Papers

LG Oct 27, 2023
Addressing GAN Training Instabilities via Tunable Classification Losses

Monica Welfert, Gowtham R. Kurri, Kyle Otstot et al.

Generative adversarial networks (GANs), modeled as a zero-sum game between a generator (G) and a discriminator (D), allow generating synthetic data with formal guarantees. Noting that D is a classifier, we begin by reformulating the GAN value function using class probability estimation (CPE) losses. We prove a two-way correspondence between CPE loss GANs and $f$-GANs which minimize $f$-divergences. We also show that all symmetric $f$-divergences are equivalent in convergence. In the finite sample and model capacity setting, we define and obtain bounds on estimation and generalization errors. We specialize these results to $α$-GANs, defined using $α$-loss, a tunable CPE loss family parametrized by $α\in(0,\infty]$. We next introduce a class of dual-objective GANs to address training instabilities of GANs by modeling each player's objective using $α$-loss to obtain $(α_D,α_G)$-GANs. We show that the resulting non-zero sum game simplifies to minimizing an $f$-divergence under appropriate conditions on $(α_D,α_G)$. Generalizing this dual-objective formulation using CPE losses, we define and obtain upper bounds on an appropriately defined estimation error. Finally, we highlight the value of tuning $(α_D,α_G)$ in alleviating training instabilities for the synthetic 2D Gaussian mixture ring as well as the large publicly available Celeb-A and LSUN Classroom image datasets.

LG May 12, 2022
$α$-GAN: Convergence and Estimation Guarantees

Gowtham R. Kurri, Monica Welfert, Tyler Sypherd et al.

We prove a two-way correspondence between the min-max optimization of general CPE loss function GANs and the minimization of associated $f$-divergences. We then focus on $α$-GAN, defined via the $α$-loss, which interpolates between several GANs (Hellinger, vanilla, Total Variation) and corresponds to the minimization of the Arimoto divergence. We show that the Arimoto divergences induced by $α$-GAN equivalently converge, for all $α\in \mathbb{R}_{>0}\cup\{\infty\}$. However, under restricted learning models and finite samples, we provide estimation bounds which indicate diverse GAN behavior as a function of $α$. Finally, we present empirical results on a toy dataset that highlight the practical utility of tuning the $α$ hyperparameter.
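
The abstract above does not spell out the $α$-loss itself. As a minimal sketch, assuming the closed form commonly given for $α$-loss in the literature, $\ell_α(p) = \frac{α}{α-1}\left(1 - p^{(α-1)/α}\right)$ for the probability $p$ assigned to the true label, the following Python function evaluates it and handles the limiting cases $α = 1$ (log-loss) and $α = \infty$ (soft 0-1 loss). The function name `alpha_loss` is illustrative and not taken from the paper.

```python
import math


def alpha_loss(p: float, alpha: float) -> float:
    """Alpha-loss for probability p (0 < p <= 1) assigned to the true label.

    Assumed form: alpha/(alpha-1) * (1 - p**((alpha-1)/alpha)),
    with the limits alpha -> 1 and alpha -> infinity handled explicitly.
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("p must lie in (0, 1]")
    if alpha <= 0.0:
        raise ValueError("alpha must lie in (0, inf]")
    if alpha == 1.0:
        # Limit alpha -> 1: the usual log-loss.
        return -math.log(p)
    if math.isinf(alpha):
        # Limit alpha -> infinity: a soft 0-1 loss.
        return 1.0 - p
    return (alpha / (alpha - 1.0)) * (1.0 - p ** ((alpha - 1.0) / alpha))


if __name__ == "__main__":
    # Sweep alpha at a fixed p to see how strongly a poor prediction is penalized.
    for a in (0.5, 1.0, 2.0, math.inf):
        print(a, alpha_loss(0.25, a))
```

Smaller values of $α$ penalize low-confidence predictions more heavily, which is one way to picture how tuning $α$ changes the discriminator's behavior.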

69.5 IT May 28
Secure Distributed Hypothesis Testing

Gowtham R. Kurri, Varun Narayanan, Vinod M. Prabhakaran et al.

In distributed hypothesis testing, a central server performs hypothesis testing based on information received from distributed sensors/clients. We study a secure variant of this problem in which the central server determines the hypothesis class of an underlying distribution without learning any additional information about the distribution itself. We prove that, in its standard form, this is impossible to achieve, even for simple and highly restricted cases. To bypass this impossibility, we augment the model with a shared secret key available to clients but hidden from the server. We show that a single-bit secret key enables perfectly secure testing for simple classes by reducing the test distributions to a symmetric, canonical instance. Finally, for arbitrary hypothesis classes over finite domains, we establish a reduction to standard hypothesis testing using Private Simultaneous Messages (PSM) protocols, achieving polynomial communication and key lengths.

LG Feb 28, 2023
$(α_D,α_G)$-GANs: Addressing GAN Training Instabilities via Dual Objectives

Monica Welfert, Kyle Otstot, Gowtham R. Kurri et al.

In an effort to address the training instabilities of GANs, we introduce a class of dual-objective GANs with different value functions (objectives) for the generator (G) and discriminator (D). In particular, we model each objective using $α$-loss, a tunable classification loss, to obtain $(α_D,α_G)$-GANs, parameterized by $(α_D,α_G)\in (0,\infty]^2$. For a sufficiently large number of samples and sufficient capacities for G and D, we show that the resulting non-zero sum game simplifies to minimizing an $f$-divergence under appropriate conditions on $(α_D,α_G)$. In the finite sample and capacity setting, we define estimation error to quantify the gap in the generator's performance relative to the optimal setting with infinite samples and obtain upper bounds on this error, showing it to be order optimal under certain conditions. Finally, we highlight the value of tuning $(α_D,α_G)$ in alleviating training instabilities for the synthetic 2D Gaussian mixture ring and the Stacked MNIST datasets.

8.1 IT May 12
From Submodularity to Matrix Determinants: Strengthening Han's, Szász's, and Fischer's Inequalities

Gunank Jakhar, Gowtham R. Kurri, Suryajith Chillara et al.

Dembo, Cover, and Thomas (1991) developed an elegant information-theoretic framework for proving determinantal inequalities for positive definite matrices, which relies on the structural inequalities of differential entropy. Submodular functions, which subsume entropy, inherently satisfy these structural inequalities because they obey generalized forms of the fundamental properties of entropy -- a chain rule and the property that conditioning reduces the function's value (under an appropriate definition of conditioning). Applying subadditivity, Han's inequality (1978), and partition subadditivity (i.e., subadditivity over a partition) yields Hadamard's, Szász's, and Fischer's inequalities, respectively. Furthermore, this framework recovers Ky Fan's inequality (1955), a strengthening of Hadamard's inequality. This improvement fundamentally arises because conditional subadditivity yields a tighter upper bound on the joint entropy than the one obtained via unconditional subadditivity. In this paper, we establish conditional strengthenings of Han's inequality and partition subadditivity in the general setting of submodular functions. We derive equality conditions for these strengthened bounds and characterize when they strictly improve their unconditional counterparts. We specialize these results to differential entropy and apply them to establish strengthened versions of Szász's and Fischer's inequalities. The strengthening of Szász's inequality recovers Ky Fan's inequality as a special case, and is strictly stronger than the classical Szász's inequality for any non-diagonal positive definite matrix. We also derive an inequality concerning eigenvalues, which generalizes and strictly strengthens a corresponding eigenvalue inequality of Ky Fan. We provide numerical examples to explicitly illustrate the tightness of our proposed matrix determinantal bounds.
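
To make the classical (unstrengthened) bounds mentioned above concrete, here is a small numerical sketch in Python. It checks Hadamard's inequality, $\det K \le \prod_i K_{ii}$, and Fischer's inequality, $\det K \le \det K_S \cdot \det K_{S^c}$ for a block partition, on one positive definite matrix. The matrix `K`, the chosen partition, and the helper names are illustrative assumptions; the paper's strengthened bounds are not reproduced here.

```python
def det(m):
    """Determinant by cofactor expansion along the first row (fine for small matrices)."""
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det(minor)
    return total


def principal_submatrix(m, idx):
    """Submatrix of m keeping the rows and columns listed in idx."""
    return [[m[i][j] for j in idx] for i in idx]


# Illustrative symmetric positive definite matrix (leading minors 4, 11, 43).
K = [[4, 1, 2],
     [1, 3, 0],
     [2, 0, 5]]

det_K = det(K)

# Hadamard: product of the diagonal entries.
hadamard_bound = 1
for i in range(len(K)):
    hadamard_bound *= K[i][i]

# Fischer: product of determinants over the partition {0, 1} and {2}.
fischer_bound = det(principal_submatrix(K, [0, 1])) * det(principal_submatrix(K, [2]))

if __name__ == "__main__":
    print(det_K, fischer_bound, hadamard_bound)
```

For this matrix the coarser partition used by Fischer's inequality gives a bound no larger than Hadamard's, which matches the entropy picture: grouping coordinates discards less dependence than treating every coordinate separately.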

LG Jul 23, 2025
Generalized Dual Discriminator GANs

Penukonda Naga Chandana, Tejas Srivastava, Gowtham R. Kurri et al.

Dual discriminator generative adversarial networks (D2 GANs) were introduced to mitigate the problem of mode collapse in generative adversarial networks. In D2 GANs, two discriminators are employed alongside a generator: one discriminator rewards high scores for samples from the true data distribution, while the other favors samples from the generator. In this work, we first introduce dual discriminator $α$-GANs (D2 $α$-GANs), which combine the strengths of dual discriminators with the flexibility of a tunable loss function, $α$-loss. We further generalize this approach to arbitrary functions defined on positive reals, leading to a broader class of models we refer to as generalized dual discriminator generative adversarial networks. For each of these proposed models, we provide theoretical analysis and show that the associated min-max optimization reduces to the minimization of a linear combination of an $f$-divergence and a reverse $f$-divergence. This generalizes the known simplification for D2 GANs, where the objective reduces to a linear combination of the KL-divergence and the reverse KL-divergence. Finally, we perform experiments on 2D synthetic data and use multiple performance metrics to capture various advantages of our GANs.
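
As a rough illustration of the quantity the D2 GAN objective is said to reduce to, the sketch below computes a weighted sum of the KL-divergence and the reverse KL-divergence between two discrete distributions. The weights `w_fwd` and `w_rev`, the function names, and the example distributions are assumptions for illustration; the abstract does not specify the coefficients, and any additive constants in the reduced objective are omitted.

```python
import math


def kl(p, q):
    """KL(p || q) for discrete distributions on the same finite support.

    Assumes q[i] > 0 wherever p[i] > 0; terms with p[i] == 0 contribute zero.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def kl_plus_reverse_kl(p_data, p_gen, w_fwd=1.0, w_rev=1.0):
    """Hypothetical weighted sum w_fwd * KL(data || gen) + w_rev * KL(gen || data)."""
    return w_fwd * kl(p_data, p_gen) + w_rev * kl(p_gen, p_data)


if __name__ == "__main__":
    p_data = [0.5, 0.5]   # two equally likely modes
    p_gen = [0.9, 0.1]    # generator that mostly covers one mode
    print(kl(p_data, p_gen), kl(p_gen, p_data))
    print(kl_plus_reverse_kl(p_data, p_gen))
```

The forward term grows when the generator puts little mass on a data mode, while the reverse term grows when the generator puts mass where the data has little, so mixing the two trades off mode coverage against sample fidelity.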

LG Jun 9, 2021
Realizing GANs via a Tunable Loss Function

Gowtham R. Kurri, Tyler Sypherd, Lalitha Sankar

We introduce a tunable GAN, called $α$-GAN, parameterized by $α\in (0,\infty]$, which interpolates between various $f$-GANs and Integral Probability Metric based GANs (under a constrained discriminator set). We construct $α$-GAN using a supervised loss function, namely, $α$-loss, which is a tunable loss function capturing several canonical losses. We show that $α$-GAN is intimately related to the Arimoto divergence, which was first proposed by Österreicher (1996), and later studied by Liese and Vajda (2006). We also study the convergence properties of $α$-GAN. We posit that the holistic understanding that $α$-GAN introduces will have practical benefits of addressing both the issues of vanishing gradients and mode collapse.