MagicWand: A Universal Agent for Generation and Evaluation Aligned with User Preference
The paper addresses preference alignment in AI-generated content, though the contribution appears incremental: it builds on existing AIGC models and adds new datasets and methods.
The paper tackles a common problem: users struggle to obtain AI-generated content that matches their preferences, because detailed prompts are hard to craft and there is no mechanism for retaining preferences. Its answer is MagicWand, a universal agent that generates and evaluates content aligned with user preferences. Experiments on the UniPreferBench benchmark, which has over 120K annotations, show consistent alignment across diverse scenarios.
Recent advances in AIGC (Artificial Intelligence Generated Content) models have enabled significant progress in image and video generation. However, users still struggle to obtain content that aligns with their preferences due to the difficulty of crafting detailed prompts and the lack of mechanisms to retain their preferences. To address these challenges, we construct \textbf{UniPrefer-100K}, a large-scale dataset comprising images, videos, and associated text that describes the styles users tend to prefer. Based on UniPrefer-100K, we propose \textbf{MagicWand}, a universal generation and evaluation agent that enhances prompts based on user preferences, leverages advanced generation models for high-quality content, and applies preference-aligned evaluation and refinement. In addition, we introduce \textbf{UniPreferBench}, the first large-scale benchmark with over 120K annotations for assessing user preference alignment across diverse AIGC tasks. Experiments on UniPreferBench demonstrate that MagicWand consistently generates content and evaluations that are well aligned with user preferences across a wide range of scenarios.
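The pipeline named in the abstract (preference-based prompt enhancement, generation, preference-aligned evaluation, refinement) can be pictured as a simple loop. The Python sketch below is illustrative only and is not the authors' implementation. Every name in it (`enhance_prompt`, `refine_prompt`, `run_agent`, `toy_generate`, `toy_evaluate`), the keyword-matching scorer, the score threshold, and the round limit are assumptions made for this example. In a real system the generator and evaluator would be learned models.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class AgentResult:
    prompt: str    # prompt that produced the best content
    content: str   # best content seen so far
    score: float   # evaluator score of that content, in [0, 1]
    round: int     # 1-based round in which the best content was produced


def enhance_prompt(user_prompt: str, preferences: List[str]) -> str:
    """Hypothetical enhancement: append stored preferences the prompt lacks."""
    missing = [p for p in preferences if p.lower() not in user_prompt.lower()]
    if not missing:
        return user_prompt
    return user_prompt + ", " + ", ".join(missing)


def refine_prompt(prompt: str, preferences: List[str], content: str) -> str:
    """Hypothetical refinement: emphasize preferences the content did not reflect."""
    unmet = [p for p in preferences if p.lower() not in content.lower()]
    return prompt + "".join(f", emphasize {p}" for p in unmet)


def run_agent(
    user_prompt: str,
    preferences: List[str],
    generate: Callable[[str], str],
    evaluate: Callable[[str, List[str]], float],
    threshold: float = 0.8,  # assumed acceptance score
    max_rounds: int = 3,     # assumed refinement budget
) -> AgentResult:
    """Enhance the prompt, then generate, evaluate, and refine until accepted."""
    if max_rounds < 1:
        raise ValueError("max_rounds must be at least 1")
    prompt = enhance_prompt(user_prompt, preferences)
    best: Optional[AgentResult] = None
    for round_idx in range(1, max_rounds + 1):
        content = generate(prompt)
        score = evaluate(content, preferences)
        # Keep the highest-scoring attempt; ties keep the earlier one.
        if best is None or score > best.score:
            best = AgentResult(prompt, content, score, round_idx)
        if score >= threshold:
            break
        prompt = refine_prompt(prompt, preferences, content)
    assert best is not None  # guaranteed because max_rounds >= 1
    return best


def toy_generate(prompt: str) -> str:
    """Stand-in for an image or video generator: echoes the prompt."""
    return f"<content for: {prompt}>"


def toy_evaluate(content: str, preferences: List[str]) -> float:
    """Stand-in evaluator: fraction of preferences mentioned in the content."""
    if not preferences:
        return 1.0
    hits = sum(p.lower() in content.lower() for p in preferences)
    return hits / len(preferences)


if __name__ == "__main__":
    result = run_agent(
        "a cat on a sofa", ["watercolor", "warm tones"], toy_generate, toy_evaluate
    )
    print(result)
```

The generator and evaluator are passed in as callables, so the loop itself stays independent of any particular model. Swapping the toy stand-ins for real components would not change the control flow.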