ImageGem: In-the-wild Generative Image Interaction Dataset for Generative Model Personalization
The paper addresses the problem of personalizing generative models for individual users by providing a large-scale dataset, though the contribution is incremental in that it builds on existing methods such as LoRAs and diffusion models.
The paper tackles the lack of in-the-wild, fine-grained user preference data for generative model personalization by introducing ImageGem, a dataset covering 57K users, 242K customized LoRAs, 3M text prompts, and 5M generated images. According to the authors, preference annotations from the dataset allowed them to train better preference alignment models, and the dataset as a whole demonstrates a new paradigm for generative model personalization.
We introduce ImageGem, a dataset for studying generative models that understand fine-grained individual preferences. We posit that a key challenge hindering the development of such a generative model is the lack of in-the-wild and fine-grained user preference annotations. Our dataset features real-world interaction data from 57K users, who collectively have built 242K customized LoRAs, written 3M text prompts, and created 5M generated images. With user preference annotations from our dataset, we were able to train better preference alignment models. In addition, leveraging individual user preference, we investigated the performance of retrieval models and a vision-language model on personalized image retrieval and generative model recommendation. Finally, we propose an end-to-end framework for editing customized diffusion models in a latent weight space to align with individual user preferences. Our results demonstrate that the ImageGem dataset enables, for the first time, a new paradigm for generative model personalization.