CV | AI | HC | Nov 10, 2018

Use of Neural Signals to Evaluate the Quality of Generative Adversarial Network Performance in Facial Image Generation

arXiv:1811.04172v3 | 41 citations
Originality: Highly original
AI Analysis

The paper addresses the challenge of inconsistent GAN evaluation metrics faced by researchers and developers in computer vision and AI. It offers a more human-aligned assessment method, although the contribution is incremental in that it applies neural signals to an existing evaluation bottleneck.

The paper tackles the problem of evaluating GAN performance in facial image generation by proposing a novel Neuroscore based on neural signals from a brain-computer interface, which correlates strongly with human perceptual judgments (r = -0.767, p = 2.089e-10) and outperforms conventional metrics.

There is growing interest in using generative adversarial networks (GANs) to produce image content that a typical person cannot distinguish from real images. A number of GAN variants have been proposed for this purpose; however, evaluating GAN performance is inherently difficult because current methods for measuring output quality are not always consistent with what a human perceives. We propose a novel approach that combines a brain-computer interface (BCI) with GANs to generate a measure we call Neuroscore, which closely mirrors the behavioral ground truth measured from participants tasked with discerning real from synthetic images. We call this technique a neuro-AI interface, as it provides an interface between a human's neural systems and an AI process. In this paper, we first take the three metrics most widely used in the literature for evaluating the visual quality of GAN output and compare their results with human judgments. Second, we propose and demonstrate a novel approach using neural signals and rapid serial visual presentation (RSVP) that directly measures a human perceptual response to the quality of generated facial images, independent of a behavioral response measurement. The Pearson correlation between our proposed Neuroscore and human perceptual judgments is $r(48) = -0.767$, $p = 2.089 \times 10^{-10}$. A bootstrap test of the correlation gives $p \leq 0.0001$. Results show that Neuroscore is more consistent with human judgment than the conventional metrics we evaluated. We conclude that neural signals have potential applications for high-quality, rapid evaluation of GANs in the context of visual image synthesis.
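To make the reported statistics concrete, the following is a minimal sketch of how a Pearson correlation with 48 degrees of freedom (50 paired observations) and a bootstrap p-value could be computed. It is not the authors' code. The function names (`pearson_r`, `t_from_r`, `bootstrap_null_p`) are hypothetical, and the bootstrap scheme (resampling each variable independently with replacement to simulate "no association") is an assumption, since the abstract does not describe the procedure used.

```python
import math
import random


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    denom = math.sqrt(var_x * var_y)
    # A degenerate resample (all values identical) has no defined correlation;
    # this sketch treats it as 0.
    return cov / denom if denom > 0 else 0.0


def t_from_r(r, df):
    """t statistic for testing r against zero, with df = n - 2."""
    return r * math.sqrt(df / (1.0 - r * r))


def bootstrap_null_p(x, y, n_resamples=2000, seed=0):
    """Two-sided p-value from a bootstrap under the null of no association.

    Assumed scheme: x and y are resampled independently with replacement,
    which breaks any pairing between them.
    """
    rng = random.Random(seed)
    observed = abs(pearson_r(x, y))
    n = len(x)
    hits = 0
    for _ in range(n_resamples):
        xs = [x[rng.randrange(n)] for _ in range(n)]
        ys = [y[rng.randrange(n)] for _ in range(n)]
        if abs(pearson_r(xs, ys)) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_resamples + 1)


# Synthetic stand-in data (NOT the paper's measurements): 50 pairs with a
# built-in negative relationship, mimicking the sign of the reported r.
rng = random.Random(42)
neuro = [rng.gauss(0.0, 1.0) for _ in range(50)]
human = [-v + 0.5 * rng.gauss(0.0, 1.0) for v in neuro]

r = pearson_r(neuro, human)
print("r =", round(r, 3), " t =", round(t_from_r(r, len(neuro) - 2), 2))
print("bootstrap p =", bootstrap_null_p(neuro, human))
```

With the add-one correction, the smallest p-value this sketch can report is 1 / (n_resamples + 1), so a bound such as $p \leq 0.0001$ would need on the order of 10,000 resamples.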

