A Survey on Personalized Content Synthesis with Diffusion Models
This survey addresses the need for an up-to-date summary of personalized content synthesis (PCS) for researchers and practitioners, since existing surveys offer limited coverage of the area. Its contribution is incremental: it synthesizes existing work without introducing new methods.
This paper provides a comprehensive survey of Personalized Content Synthesis (PCS) with diffusion models. Analyzing over 150 methods, it categorizes the general frameworks, examines their strengths and limitations, reviews specialized tasks, highlights challenges such as overfitting, and proposes future directions.
Recent advancements in diffusion models have significantly impacted content creation, leading to the emergence of Personalized Content Synthesis (PCS). By utilizing a small set of user-provided examples featuring the same subject, PCS aims to tailor this subject to specific user-defined prompts. Over the past two years, more than 150 methods have been introduced in this area. However, existing surveys primarily focus on text-to-image generation, and few provide up-to-date summaries of PCS. This paper provides a comprehensive survey of PCS, introducing the general frameworks of PCS research, which can be categorized into test-time fine-tuning (TTF) and pre-trained adaptation (PTA) approaches. We analyze the strengths, limitations, and key techniques of these methodologies. Additionally, we explore specialized tasks within the field, such as object, face, and style personalization, highlighting their unique challenges and innovations. Despite the promising progress, challenges remain, and we discuss them, including overfitting and the trade-off between subject fidelity and text alignment. Building on this overview and analysis, we propose future directions to further the development of PCS.