CV · Feb 23

ConceptPrism: Concept Disentanglement in Personalized Diffusion Models via Residual Token Optimization

arXiv:2602.19575v1
h-index: 3
Originality: Incremental advance
AI Analysis

This paper addresses concept entanglement for users of personalized diffusion models, offering an automated solution that needs no manual guidance. The advance is incremental, since it builds on existing disentanglement approaches.

The paper tackles concept entanglement in personalized text-to-image generation, where irrelevant residual information from reference images compromises concept fidelity and text alignment. It proposes ConceptPrism, which automatically disentangles the shared visual concept from image-specific residuals, and reports a significantly improved trade-off between fidelity and alignment.

Personalized text-to-image generation suffers from concept entanglement, where irrelevant residual information from reference images is captured, leading to a trade-off between concept fidelity and text alignment. Recent disentanglement approaches attempt to solve this utilizing manual guidance, such as linguistic cues or segmentation masks, which limits their applicability and fails to fully articulate the target concept. In this paper, we propose ConceptPrism, a novel framework that automatically disentangles the shared visual concept from image-specific residuals by comparing images within a set. Our method jointly optimizes a target token and image-wise residual tokens using two complementary objectives: a reconstruction loss to ensure fidelity, and a novel exclusion loss that compels residual tokens to discard the shared concept. This process allows the target token to capture the pure concept without direct supervision. Extensive experiments demonstrate that ConceptPrism effectively resolves concept entanglement, achieving a significantly improved trade-off between fidelity and alignment.
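The joint objective described in the abstract can be sketched in symbols. This is only an illustrative reading, not the paper's formulation: the notation ($v^{*}$ for the target token, $r_i$ for the residual token of image $x_i$, $c(\cdot)$ for the text conditioning, and the weight $\lambda$) is assumed here, the reconstruction term is written as the standard denoising loss used in token-optimization methods, and the exclusion loss is left abstract because the abstract does not state its form.

```latex
% Assumed notation: N reference images x_1, ..., x_N, one shared target
% token v^* and one residual token r_i per image.
% Reconstruction term: standard denoising objective, conditioned on both
% the target token and the image-specific residual token (an assumption).
\mathcal{L}_{\mathrm{rec}}
  = \mathbb{E}_{i,\,t,\,\epsilon \sim \mathcal{N}(0, I)}
    \left\| \epsilon - \epsilon_\theta\!\left(x_{i,t},\, t,\, c(v^{*}, r_i)\right) \right\|_2^2

% Exclusion term: left unspecified; per the abstract it should penalize
% residual tokens r_i that still encode the shared concept.
% The weight \lambda is a hypothetical balancing hyperparameter.
\mathcal{L}
  = \mathcal{L}_{\mathrm{rec}}
  + \lambda\, \mathcal{L}_{\mathrm{excl}}(r_1, \dots, r_N)

% Joint optimization over the target token and all residual tokens.
v^{*},\, r_1, \dots, r_N
  = \arg\min_{v,\, r_1, \dots, r_N} \mathcal{L}
```

Under this reading, the reconstruction term keeps the pair $(v^{*}, r_i)$ faithful to each image, while the exclusion term pushes shared content out of every $r_i$, so that content can only be explained by $v^{*}$.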

