LG · ML · Mar 7, 2024

What makes an image realistic?

arXiv:2403.04493v4 · 10 citations · h-index: 1 · ICML
AI Analysis

This paper addresses a fundamental challenge in machine learning, the evaluation of generative AI, but its contribution is incremental: it offers theoretical insight rather than a ready-to-use solution.

The paper tackles the problem of quantifying realism in generated data, such as images, by introducing the concept of a universal critic. Unlike an adversarial critic, a universal critic requires no adversarial training, though it is not immediately practical.

The last decade has seen tremendous progress in our ability to generate realistic-looking data, be it images, text, audio, or video. Here, we discuss the closely related problem of quantifying realism, that is, designing functions that can reliably tell realistic data from unrealistic data. This problem turns out to be significantly harder to solve and remains poorly understood, despite its prevalence in machine learning and recent breakthroughs in generative AI. Drawing on insights from algorithmic information theory, we discuss why this problem is challenging, why a good generative model alone is insufficient to solve it, and what a good solution would look like. In particular, we introduce the notion of a universal critic, which unlike adversarial critics does not require adversarial training. While universal critics are not immediately practical, they can serve both as a North Star for guiding practical implementations and as a tool for analyzing existing attempts to capture realism.

