CV · AI · Mar 25, 2023

Human Preference Score: Better Aligning Text-to-Image Models with Human Preference

arXiv:2303.14420v2 · 336 citations · h-index: 82
Originality: Incremental advance
AI Analysis

This paper addresses the misalignment between text-to-image outputs and human preferences, a problem that affects both users and developers of these models. The advance is incremental: it adapts an existing model with a new scoring method.

The authors tackle the problem of text-to-image models generating images that people do not prefer, such as images with awkward limbs and facial expressions. They collect a dataset of human choices and train a classifier on it, from which they derive the Human Preference Score (HPS). They report that HPS outperforms CLIP in predicting human choices, and that tuning Stable Diffusion with HPS guidance produces images that users prefer more.

Recent years have witnessed a rapid growth of deep generative models, with text-to-image models gaining significant attention from the public. However, existing models often generate images that do not align well with human preferences, such as awkward combinations of limbs and facial expressions. To address this issue, we collect a dataset of human choices on generated images from the Stable Foundation Discord channel. Our experiments demonstrate that current evaluation metrics for generative models do not correlate well with human choices. Thus, we train a human preference classifier with the collected dataset and derive a Human Preference Score (HPS) based on the classifier. Using HPS, we propose a simple yet effective method to adapt Stable Diffusion to better align with human preferences. Our experiments show that HPS outperforms CLIP in predicting human choices and has good generalization capability toward images generated from other models. By tuning Stable Diffusion with the guidance of HPS, the adapted model is able to generate images that are more preferred by human users. The project page is available here: https://tgxs002.github.io/align_sd_web/ .
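The abstract does not spell out how the classifier turns a prompt and a set of candidate images into a predicted human choice. The sketch below shows one plausible shape for that step, under loudly stated assumptions: the score is taken to be a scaled cosine similarity between an image embedding and a prompt embedding (as in CLIP-style models), and the predicted choice is a softmax over the candidates' scores. The function names, the scale of 100, and the toy embeddings are all hypothetical; in a real pipeline the embeddings would come from trained image and text encoders.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def preference_score(image_emb, prompt_emb, scale=100.0):
    """Hypothetical preference score: scaled image-prompt cosine similarity.

    The scale factor is an assumption made for illustration.
    """
    return scale * cosine_similarity(image_emb, prompt_emb)


def choice_probabilities(image_embs, prompt_emb):
    """Softmax over the candidates' scores: P(user picks image i | prompt)."""
    scores = [preference_score(e, prompt_emb) for e in image_embs]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def choice_loss(image_embs, prompt_emb, chosen_index):
    """Negative log-likelihood of the image the user actually chose."""
    probs = choice_probabilities(image_embs, prompt_emb)
    return -math.log(probs[chosen_index])


# Toy example: two candidate images for one prompt (made-up embeddings).
prompt = [1.0, 0.0]
candidates = [[0.9, 0.1], [0.2, 0.8]]
probs = choice_probabilities(candidates, prompt)
loss = choice_loss(candidates, prompt, chosen_index=0)
```

Minimizing a loss of this kind over many recorded choices would push the score of the chosen image above the scores of the rejected ones, which is one way a classifier trained on human choices can double as a per-image preference score.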

Code Implementations: 1 repo
