LG, Jul 19, 2023
Adversarial Likelihood Estimation With One-Way Flows
Omri Ben-Dov, Pravir Singh Gupta, Victoria Abrevaya et al.
Generative Adversarial Networks (GANs) can produce high-quality samples, but do not provide an estimate of the probability density around the samples. However, it has been noted that maximizing the log-likelihood within an energy-based setting can lead to an adversarial framework where the discriminator provides an unnormalized density (often called energy). We further develop this perspective, incorporate importance sampling, and show that 1) Wasserstein GAN performs a biased estimate of the partition function, and we propose instead to use an unbiased estimator; and 2) when optimizing for likelihood, one must maximize generator entropy. This is hypothesized to provide better mode coverage. Unlike previous works, we explicitly compute the density of the generated samples. This is the key enabler for designing an unbiased estimator of the partition function and for computing the generator entropy term. The generator density is obtained via a new type of flow network, called a one-way flow network, that is less constrained in terms of architecture, as it does not require a tractable inverse function. Our experimental results show that our method converges faster, produces sample quality comparable to GANs with similar architecture, successfully avoids over-fitting to commonly used datasets, and produces smooth low-dimensional latent representations of the training data.
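The importance-sampling idea mentioned in this abstract can be illustrated with a minimal sketch. It is not the paper's estimator: it only shows the generic identity Z = E_{x~q}[exp(-E(x)) / q(x)], which gives an unbiased estimate of a partition function when the proposal density q is known exactly. The toy 1D energy E(x) = x^2 / 2, the Gaussian proposal standing in for a generator with tractable density, and all function names are assumptions made for illustration.

```python
import math
import random

def energy(x):
    # Hypothetical toy energy; its true partition function is sqrt(2*pi).
    return 0.5 * x * x

def proposal_logpdf(x, sigma):
    # Log-density of the N(0, sigma^2) proposal (stand-in for a generator density).
    return -0.5 * (x / sigma) ** 2 - math.log(sigma) - 0.5 * math.log(2 * math.pi)

def estimate_partition(n_samples, sigma=2.0, seed=0):
    # Importance-sampling estimate: average of exp(-E(x)) / q(x) over x ~ q.
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        x = rng.gauss(0.0, sigma)
        total += math.exp(-energy(x) - proposal_logpdf(x, sigma))
    return total / n_samples

z_hat = estimate_partition(200_000)
z_true = math.sqrt(2 * math.pi)
print(f"estimate={z_hat:.4f}  exact={z_true:.4f}")
```

The estimate is only usable because the proposal density can be evaluated at every sample, which mirrors the abstract's point that explicit generator densities are what make such an estimator possible.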
LG, Jun 23, 2022
LED: Latent Variable-based Estimation of Density
Omri Ben-Dov, Pravir Singh Gupta, Victoria Fernandez Abrevaya et al.
Modern generative models are roughly divided into two main categories: (1) models that can produce high-quality random samples, but cannot estimate the exact density of new data points and (2) those that provide exact density estimation, at the expense of sample quality and compactness of the latent space. In this work we propose LED, a new generative model closely related to GANs, that allows not only efficient sampling but also efficient density estimation. By maximizing log-likelihood on the output of the discriminator, we arrive at an alternative adversarial optimization objective that encourages generated data diversity. This formulation provides insights into the relationships between several popular generative models. Additionally, we construct a flow-based generator that can compute exact probabilities for generated samples, while allowing low-dimensional latent variables as input. Our experimental results, on various datasets, show that our density estimator produces accurate estimates, while retaining good quality in the generated samples.
CV, Aug 31, 2020
GIF: Generative Interpretable Faces
Partha Ghosh, Pravir Singh Gupta, Roy Uziel et al.
Photo-realistic visualization and animation of expressive human faces have been a long-standing challenge. 3D face modeling methods provide parametric control but generate unrealistic images, while generative 2D models like GANs (Generative Adversarial Networks) output photo-realistic face images but lack explicit control. Recent methods gain partial control, either by attempting to disentangle different factors in an unsupervised manner, or by adding control post hoc to a pre-trained model. Unconditional GANs, however, may entangle factors that are hard to undo later. We condition our generative model on pre-defined control parameters to encourage disentanglement in the generation process. Specifically, we condition StyleGAN2 on FLAME, a generative 3D face model. While conditioning on FLAME parameters yields unsatisfactory results, we find that conditioning on rendered FLAME geometry and photometric details works well. This gives us a generative 2D face model named GIF (Generative Interpretable Faces) that offers FLAME's parametric control. Here, interpretable refers to the semantic meaning of different parameters. Given FLAME parameters for shape, pose, and expressions, parameters for appearance and lighting, and an additional style vector, GIF outputs photo-realistic face images. We perform an AMT-based perceptual study to quantitatively and qualitatively evaluate how well GIF follows its conditioning. The code, data, and trained model are publicly available for research purposes at http://gif.is.tue.mpg.de.
IV, Sep 23, 2019
DRCAS: Deep Restoration Network for Hardware Based Compressive Acquisition Scheme
Pravir Singh Gupta, Xin Yuan, Gwan Seong Choi
We investigate the power and performance improvements that CAS (Compressed Acquisition Scheme) and DNNs (Deep Neural Networks) can bring to image acquisition devices. Towards this end, we propose a novel image acquisition scheme, HCAS (Hardware based Compressed Acquisition Scheme), which uses hardware-based binning (downsampling), bit truncation, and JPEG compression, and we develop a deep learning based reconstruction network for images acquired with it. HCAS is motivated by the fact that in-situ compression of raw data using binning and bit truncation reduces data traffic and power in the entire downstream image processing pipeline, while additional compression of the processed data using JPEG helps in the storage and transmission of images. The combination of in-situ compression with JPEG leads to high compression ratios and significant power savings, with the further advantage of simplifying image acquisition. Bearing these concerns in mind, we propose DRCAS (Deep Restoration network for hardware based Compressed Acquisition Scheme), which, to the best of our knowledge, is the first work proposed in the literature for restoring images acquired using an acquisition scheme like HCAS. Compared with the CAS methods (bicubic downsampling) used for super-resolution tasks in the literature, the HCAS proposed in this paper is superior in both compression ratio and hardware friendliness. The restoration network DRCAS also outperforms state-of-the-art super-resolution networks while being much smaller. Thus, the HCAS and DRCAS techniques will enable the design of much simpler and more power-efficient image acquisition pipelines.
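The first two HCAS stages named in this abstract, binning and bit truncation, can be sketched in a few lines. This is a toy illustration under stated assumptions, not the authors' pipeline: the 2x2 binning factor, the use of block averaging, the number of dropped bits, and the function names are all hypothetical, and the JPEG stage is omitted.

```python
def bin_image(img, factor=2):
    # Downsample by averaging non-overlapping factor x factor blocks
    # (integer mean; averaging is an assumption, summation is also common).
    h, w = len(img), len(img[0])
    out = []
    for r in range(0, h - h % factor, factor):
        row = []
        for c in range(0, w - w % factor, factor):
            block = [img[r + i][c + j]
                     for i in range(factor) for j in range(factor)]
            row.append(sum(block) // len(block))
        out.append(row)
    return out

def truncate_bits(img, drop_bits=2):
    # Discard the least-significant bits of every pixel value.
    return [[p >> drop_bits for p in row] for row in img]

# Hypothetical 4x4 patch of 8-bit pixel values.
raw = [
    [10, 12, 200, 204],
    [14, 16, 208, 212],
    [100, 104, 50, 54],
    [108, 112, 58, 62],
]
binned = bin_image(raw)             # 4x4 -> 2x2
compressed = truncate_bits(binned)  # 8-bit -> 6-bit values
print(binned, compressed)
```

With these assumed settings, each pixel block shrinks from 4 values of 8 bits to 1 value of 6 bits before any JPEG coding, which is the kind of in-situ data reduction the abstract argues saves downstream traffic and power.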