CV | Oct 2, 2022

Generated Faces in the Wild: Quantitative Comparison of Stable Diffusion, Midjourney and DALL-E 2

arXiv:2210.00586v2 | 180 citations | h-index: 54
Originality: Synthesis-oriented
AI Analysis

This study addresses the need for fine-grained evaluation of image synthesis models on faces, providing a benchmark for researchers and practitioners in generative AI.

The paper quantitatively compares Stable Diffusion, Midjourney, and DALL-E 2 on generating photorealistic faces. It finds that Stable Diffusion performs best, scoring a lower FID (lower is better) than the other two systems, and it introduces a dataset of 15,076 generated faces.

The field of image synthesis has made great strides in the last couple of years, and recent models can generate images of astonishing quality. Fine-grained evaluation of these models on interesting categories such as faces is still missing. Here, we conduct a quantitative comparison of three popular systems, Stable Diffusion, Midjourney, and DALL-E 2, in their ability to generate photorealistic faces in the wild. We find that Stable Diffusion generates better faces than the other systems, according to the FID score. We also introduce a dataset of generated faces in the wild, dubbed GFW, comprising 15,076 faces. We hope that our study spurs follow-up research on assessing and improving generative models. Data and code are available online.
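
Since the comparison rests on the FID score, a minimal sketch of how a Fréchet distance between two feature sets can be computed may help. It is an illustration, not the authors' code. It assumes feature vectors have already been extracted (standard FID uses Inception-v3 pooling features, a step omitted here), and the function name `frechet_distance` and the toy arrays are hypothetical. Each feature set is summarized by its mean and covariance, and the score is ||mu_r - mu_g||^2 + Tr(C_r + C_g - 2 (C_r C_g)^(1/2)).

```python
import numpy as np


def frechet_distance(feats_real, feats_gen):
    """Frechet distance between Gaussians fitted to two feature sets.

    Both inputs are arrays of shape (n_samples, n_features). In real FID
    these would be Inception-v3 features; here they are assumed given.
    """
    mu_r = feats_real.mean(axis=0)
    mu_g = feats_gen.mean(axis=0)
    cov_r = np.atleast_2d(np.cov(feats_real, rowvar=False))
    cov_g = np.atleast_2d(np.cov(feats_gen, rowvar=False))

    # Tr((cov_r cov_g)^(1/2)) equals the sum of the square roots of the
    # eigenvalues of cov_r @ cov_g, which are real and non-negative for
    # positive semi-definite covariances (up to floating-point noise).
    eigvals = np.linalg.eigvals(cov_r @ cov_g)
    tr_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()

    diff = mu_r - mu_g
    return float(diff @ diff + np.trace(cov_r) + np.trace(cov_g) - 2.0 * tr_sqrt)


# Toy data (hypothetical): the "generated" set is the "real" set shifted
# by 3 in each of 4 dimensions, so only the mean term contributes.
rng = np.random.default_rng(0)
real = rng.normal(size=(500, 4))
gen = real + 3.0

score_same = frechet_distance(real, real)
score_shifted = frechet_distance(real, gen)
```

In the toy example the covariances are identical, so the score reduces to the squared mean shift, 4 * 3**2 = 36. A lower score means the generated features are statistically closer to the real ones, which is the sense in which the abstract ranks the three systems.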

Code Implementations: 1 repo