GANji: A Framework for Introductory AI Image Generation
GANji gives researchers and practitioners an accessible tool for understanding trade-offs among generative models, though the contribution is incremental: it applies existing methods to a new dataset.
The paper addresses the high computational cost of comparing generative models by introducing GANji, a lightweight framework for benchmarking AI image generation techniques on a dataset of 10,314 Japanese Kanji characters. It finds that the DDPM achieved the highest fidelity, with an FID score of 26.2, but was over 2,000 times slower at sampling than the other models.
The comparative study of generative models often requires significant computational resources, creating a barrier for researchers and practitioners. This paper introduces GANji, a lightweight framework for benchmarking foundational AI image generation techniques using a dataset of 10,314 Japanese Kanji characters. It systematically compares the performance of a Variational Autoencoder (VAE), a Generative Adversarial Network (GAN), and a Denoising Diffusion Probabilistic Model (DDPM). The results demonstrate that while the DDPM achieves the highest image fidelity, with a Fréchet Inception Distance (FID) score of 26.2, its sampling is over 2,000 times slower than that of the other models. The GANji framework is an effective and accessible tool for revealing the fundamental trade-offs between model architecture, computational cost, and visual quality, making it ideal for both educational and research purposes.
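For readers unfamiliar with the FID metric cited above, the following is a minimal sketch of the Fréchet distance between two sets of feature vectors, each modelled as a Gaussian. It is not the paper's implementation. The function name `frechet_distance` and the input arrays are illustrative assumptions, and a standard FID computation would first extract features from real and generated images with a pretrained Inception network, a step omitted here.

```python
import numpy as np

def frechet_distance(feats_a, feats_b):
    """Frechet distance between Gaussians fitted to two feature sets.

    feats_a, feats_b: arrays of shape (n_samples, n_features).
    In a real FID computation these would be Inception features of
    real and generated images; here they are arbitrary arrays (assumption).
    """
    mu_a, mu_b = feats_a.mean(axis=0), feats_b.mean(axis=0)
    cov_a = np.cov(feats_a, rowvar=False)
    cov_b = np.cov(feats_b, rowvar=False)

    # Tr((cov_a cov_b)^(1/2)) equals the sum of the square roots of the
    # eigenvalues of cov_a @ cov_b, which are real and non-negative for
    # positive semi-definite covariances (clipped to absorb round-off).
    eigvals = np.linalg.eigvals(cov_a @ cov_b)
    trace_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()

    diff = mu_a - mu_b
    return float(diff @ diff + np.trace(cov_a) + np.trace(cov_b) - 2.0 * trace_sqrt)
```

Lower values indicate that the two feature distributions are closer, which is why the DDPM's lower FID is read as higher fidelity. Identical feature sets give a distance of approximately zero, and shifting one set by a constant adds the squared length of the shift.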