CV, LG | Apr 12, 2023

ImageReward: Learning and Evaluating Human Preferences for Text-to-Image Generation

Tsinghua
arXiv:2304.05977v4 | 1057 citations | h-index: 47 | Has Code
Originality: Highly original
AI Analysis

This work addresses the challenge of evaluating and improving text-to-image models for better alignment with human preferences, representing a significant but incremental advance in the field.

The authors tackled the problem of aligning text-to-image generation with human preferences by introducing ImageReward, a reward model trained on 137k expert comparisons, which outperforms existing metrics in human evaluation, and ReFL, a tuning algorithm that improves diffusion models based on this scorer.

We present a comprehensive solution to learn and improve text-to-image models from human preference feedback. To begin with, we build ImageReward -- the first general-purpose text-to-image human preference reward model -- to effectively encode human preferences. Its training is based on our systematic annotation pipeline including rating and ranking, which collects 137k expert comparisons to date. In human evaluation, ImageReward outperforms existing scoring models and metrics, making it a promising automatic metric for evaluating text-to-image synthesis. On top of it, we propose Reward Feedback Learning (ReFL), a direct tuning algorithm to optimize diffusion models against a scorer. Both automatic and human evaluation support ReFL's advantages over compared methods. All code and datasets are provided at https://github.com/THUDM/ImageReward.
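
To make the idea of learning from ranked comparisons concrete, here is a minimal sketch of one common way such a reward model can be trained: a pairwise (Bradley-Terry style) preference loss over pairs derived from a ranking. This is an illustrative assumption, not code from the ImageReward repository. The function names, the toy scores, and the use of plain Python floats in place of a neural scorer are all hypothetical.

```python
import math
from itertools import combinations


def pairwise_preference_loss(score_preferred, score_rejected):
    """Return -log(sigmoid(score_preferred - score_rejected)).

    Computed as softplus(-diff) in a numerically stable way.
    """
    diff = score_preferred - score_rejected
    if diff >= 0:
        return math.log1p(math.exp(-diff))
    return -diff + math.log1p(math.exp(diff))


def ranking_to_pairs(ranked_ids):
    """Expand a best-to-worst ranking into (preferred, rejected) pairs."""
    # combinations() preserves input order, so the first element of each
    # pair always ranks higher than the second.
    return list(combinations(ranked_ids, 2))


def mean_ranking_loss(scores, ranked_ids):
    """Average the pairwise loss over every pair implied by one ranking.

    `scores` maps an image id to the (hypothetical) scorer's output for
    that image under a fixed prompt.
    """
    pairs = ranking_to_pairs(ranked_ids)
    total = sum(pairwise_preference_loss(scores[w], scores[l]) for w, l in pairs)
    return total / len(pairs)


if __name__ == "__main__":
    # Toy example: annotators ranked three images for one prompt.
    ranking = ["img_a", "img_b", "img_c"]
    agreeing = {"img_a": 2.0, "img_b": 0.5, "img_c": -1.0}
    disagreeing = {"img_a": -1.0, "img_b": 0.5, "img_c": 2.0}
    print(mean_ranking_loss(agreeing, ranking))
    print(mean_ranking_loss(disagreeing, ranking))
```

A scorer whose outputs agree with the annotators' ranking gets a lower loss than one that reverses it, which is the signal a reward model of this kind would be trained on.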

Code Implementations: 3 repos