CV

Dec 24, 2024

Dense-Face: Personalized Face Generation Model via Dense Annotation Prediction

Xiao Guo, Manh Tran, Jiaxin Cheng, Xiaoming Liu

arXiv:2412.18149v1

6.54 citations

h-index: 5

Originality: Incremental advance

AI Analysis

The work targets personalized face generation that keeps a subject's identity consistent while following the text prompt, a need in AI-driven content creation. It appears incremental, since it builds on Stable Diffusion with adapters and dense annotations.

The paper tackles text-to-image personalization for face generation, where existing methods either require test-time fine-tuning or align poorly with the text caption. It proposes Dense-Face, which achieves state-of-the-art or competitive performance in image-text alignment, identity preservation, and pose control.

Text-to-image (T2I) personalization diffusion models can generate images of a novel concept from a user-provided text caption. However, existing T2I personalization methods either require test-time fine-tuning or fail to generate images that align well with the given caption. In this work, we propose a new T2I personalization diffusion model, Dense-Face, which generates face images that preserve the identity of the given reference subject and align well with the text caption. Specifically, we introduce a pose-controllable adapter for high-fidelity image generation that maintains the text-based editing ability of the pre-trained Stable Diffusion (SD). Additionally, we use internal features of the SD UNet to predict dense face annotations, enabling the proposed method to gain domain knowledge in face generation. Empirically, our method achieves state-of-the-art or competitive generation performance in image-text alignment, identity preservation, and pose control.
