LG | AI | CV | Dec 18, 2024

A Unifying Information-theoretic Perspective on Evaluating Generative Models

arXiv:2412.14340v31 citations | h-index: 22 | AAAI
Originality: Incremental advance
AI Analysis

This work addresses inconsistent and hard-to-interpret evaluation metrics for generative models, a problem that matters to machine learning researchers and practitioners. The contribution is incremental, since it builds on existing kNN-based approaches.

The paper proposes a unifying information-theoretic perspective on kNN-based evaluation metrics and a new tri-dimensional metric (PCE, RCE, RE) that separately measures fidelity and two aspects of diversity, inter- and intra-class. Its experiments show that each component is sensitive to the quality it targets and reveal undesirable behaviors in other metrics.

Considering the difficulty of interpreting generative model output, there is significant current research focused on determining meaningful evaluation metrics. Several recent approaches utilize "precision" and "recall," borrowed from the classification domain, to individually quantify the output fidelity (realism) and output diversity (representation of the real data variation), respectively. With the increase in metric proposals, there is a need for a unifying perspective, allowing for easier comparison and clearer explanation of their benefits and drawbacks. To this end, we unify a class of kth-nearest-neighbors (kNN)-based metrics under an information-theoretic lens using approaches from kNN density estimation. Additionally, we propose a tri-dimensional metric composed of Precision Cross-Entropy (PCE), Recall Cross-Entropy (RCE), and Recall Entropy (RE), which separately measure fidelity and two distinct aspects of diversity, inter- and intra-class. Our domain-agnostic metric, derived from the information-theoretic concepts of entropy and cross-entropy, can be dissected for both sample- and mode-level analysis. Our detailed experimental results demonstrate the sensitivity of our metric components to their respective qualities and reveal undesirable behaviors of other metrics.
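To make the "kNN density estimation" idea in the abstract concrete, here is a minimal sketch of the classical kth-nearest-neighbor estimators of entropy and cross-entropy (Kozachenko-Leonenko style). It is not the paper's definition of PCE, RCE, or RE, which are not reproduced here. The function names, the default k=3, and the brute-force neighbor search are all assumptions made for illustration, and the sketch assumes Euclidean distance and no duplicate points (a zero distance would break the logarithm).

```python
import math

EULER_GAMMA = 0.5772156649015329


def digamma_int(k):
    """Digamma at a positive integer k: psi(k) = -gamma + sum_{j=1}^{k-1} 1/j."""
    return -EULER_GAMMA + sum(1.0 / j for j in range(1, k))


def log_unit_ball_volume(d):
    """Log-volume of the d-dimensional Euclidean unit ball."""
    return 0.5 * d * math.log(math.pi) - math.lgamma(0.5 * d + 1.0)


def kth_nn_distance(x, points, k, skip_index=None):
    """Distance from x to its kth nearest neighbor in points (brute force).

    skip_index excludes one point, used when x itself belongs to points.
    """
    dists = sorted(
        math.dist(x, p) for j, p in enumerate(points) if j != skip_index
    )
    return dists[k - 1]


def knn_cross_entropy(queries, reference, k=3, same_set=False):
    """kNN estimate (in nats) of the cross-entropy of the reference
    distribution evaluated on the query samples.

    With same_set=True the queries are the reference set itself, each point
    is excluded from its own neighbor search, and the result is the
    entropy estimate.
    """
    d = len(queries[0])
    m = len(reference) - 1 if same_set else len(reference)
    total_log_r = 0.0
    for i, x in enumerate(queries):
        r = kth_nn_distance(x, reference, k, i if same_set else None)
        total_log_r += math.log(r)  # assumes r > 0 (no duplicate points)
    return (
        d * total_log_r / len(queries)
        + math.log(m)
        - digamma_int(k)
        + log_unit_ball_volume(d)
    )


def knn_entropy(samples, k=3):
    """kNN (Kozachenko-Leonenko style) estimate of differential entropy."""
    return knn_cross_entropy(samples, samples, k=k, same_set=True)
```

As a usage sketch, `knn_entropy(real)` estimates how spread out one sample set is, while `knn_cross_entropy(generated, real)` grows when generated points fall in regions where real points are sparse. The brute-force search is quadratic in the sample count; a tree- or index-based neighbor search would be the natural replacement for larger sets.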
