RewardSDS: Aligning Score Distillation via Reward-Weighted Sampling
The work targets alignment problems in diffusion-based generation that affect users in fields such as 3D modeling and image editing. It is an incremental advance over existing score distillation methods.
The paper tackles fine-grained alignment in Score Distillation Sampling (SDS) for tasks such as text-to-3D generation. It introduces RewardSDS, which weights noise samples by their reward-model scores, and reports significant improvements in generation quality and alignment metrics over SDS and VSD.
Score Distillation Sampling (SDS) has emerged as an effective technique for leveraging 2D diffusion priors for tasks such as text-to-3D generation. While powerful, SDS struggles with achieving fine-grained alignment to user intent. To overcome this, we introduce RewardSDS, a novel approach that weights noise samples based on alignment scores from a reward model, producing a weighted SDS loss. This loss prioritizes gradients from noise samples that yield aligned, high-reward output. Our approach is broadly applicable and can extend SDS-based methods. In particular, we demonstrate its applicability to Variational Score Distillation (VSD) by introducing RewardVSD. We evaluate RewardSDS and RewardVSD on text-to-image, 2D editing, and text-to-3D generation tasks, showing significant improvements over SDS and VSD on a diverse set of metrics measuring generation quality and alignment to desired reward models, enabling state-of-the-art performance. The project page is available at https://itaychachy.github.io/reward-sds/.
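The reward-weighting idea described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation, and several details are assumptions the abstract does not specify: the weights are a softmax over rewards with a hypothetical `temperature`, the reward is computed on a one-step denoised estimate, the timestep weighting of SDS is omitted, and the image is optimized directly rather than through a 3D renderer. The names `predict_noise`, `reward_fn`, and `alpha_bar` are placeholders for a noise-prediction network, a reward model, and the cumulative noise-schedule coefficient.

```python
import numpy as np


def softmax(values, temperature=1.0):
    """Numerically stable softmax over a 1D sequence of scores."""
    z = (np.asarray(values, dtype=float) - np.max(values)) / temperature
    e = np.exp(z)
    return e / e.sum()


def reward_weighted_sds_grad(x, predict_noise, reward_fn, alpha_bar,
                             num_samples=4, temperature=1.0, rng=None):
    """Sketch of a reward-weighted SDS gradient with respect to the image x.

    Assumptions (not taken from the source): softmax weighting, reward
    evaluated on a one-step denoised estimate, no timestep weight w(t).
    """
    rng = np.random.default_rng() if rng is None else rng
    residuals, rewards = [], []
    for _ in range(num_samples):
        # Draw one noise sample and diffuse the image to the chosen noise level.
        eps = rng.standard_normal(x.shape)
        x_t = np.sqrt(alpha_bar) * x + np.sqrt(1.0 - alpha_bar) * eps
        eps_hat = predict_noise(x_t)
        # One-step estimate of the clean image implied by this noise sample.
        x0_hat = (x_t - np.sqrt(1.0 - alpha_bar) * eps_hat) / np.sqrt(alpha_bar)
        residuals.append(eps_hat - eps)      # usual SDS residual
        rewards.append(reward_fn(x0_hat))    # alignment score for this sample
    # Higher-reward noise samples contribute more to the final gradient.
    weights = softmax(rewards, temperature)
    grad = sum(w * r for w, r in zip(weights, residuals))
    return grad, weights


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image = rng.standard_normal((3, 8, 8))
    # Toy stand-ins: a "network" that predicts zero noise and a reward that
    # prefers low-energy images. Both are purely illustrative.
    grad, weights = reward_weighted_sds_grad(
        image,
        predict_noise=lambda x_t: np.zeros_like(x_t),
        reward_fn=lambda img: float(-np.mean(img ** 2)),
        alpha_bar=0.5,
        num_samples=5,
        rng=rng,
    )
    print(grad.shape, weights)
```

When every noise sample receives the same reward, the weights are uniform and the sketch reduces to an average of ordinary SDS residuals, which is one way to see the method as an extension of SDS rather than a replacement.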