If generative AI is the answer, what is the question?
This work offers researchers and practitioners a foundational perspective on generative AI, but its contribution is incremental: it synthesizes existing knowledge and presents no new empirical results.
The paper addresses the problem of defining generation as a distinct machine learning task. It explores the task's foundations, surveys five major model families, introduces a probabilistic framework, and reviews a game-theoretic one, adopting a task-first framing and covering topics in socially responsible generation such as privacy and copyright.
Beginning with text and images, generative AI has expanded to audio, video, computer code, and molecules. Yet, if generative AI is the answer, what is the question? We explore the foundations of generation as a distinct machine learning task with connections to prediction, compression, and decision-making. We survey five major generative model families: autoregressive models, variational autoencoders, normalizing flows, generative adversarial networks, and diffusion models. We then introduce a probabilistic framework that emphasizes the distinction between density estimation and generation. We review a game-theoretic framework with a two-player adversary-learner setup to study generation. We discuss post-training modifications that prepare generative models for deployment. We end by highlighting some important topics in socially responsible generation such as privacy, detection of AI-generated content, and copyright and IP. We adopt a task-first framing of generation, focusing on what generation is as a machine learning problem, rather than only on how models implement it.