CV · Dec 12, 2024

LoRACLR: Contrastive Adaptation for Customization of Diffusion Models

Enis Simsar, Thomas Hofmann, Federico Tombari, Pinar Yanardag

arXiv:2412.09622v217.321 citations · h-index: 15 · CVPR

Originality: Incremental advance

AI Analysis

LoRACLR addresses a bottleneck in personalized image generation for users who need to combine multiple concepts efficiently. The advance is incremental, since it builds on existing LoRA fine-tuning methods.

The paper tackles the problem of attribute entanglement in multi-concept image generation by introducing LoRACLR, a method that merges multiple LoRA models into a single unified model without additional fine-tuning, resulting in accurate and scalable multi-concept synthesis.

Recent advances in text-to-image customization have enabled high-fidelity, context-rich generation of personalized images, allowing specific concepts to appear in a variety of scenarios. However, current methods struggle with combining multiple personalized models, often leading to attribute entanglement or requiring separate training to preserve concept distinctiveness. We present LoRACLR, a novel approach for multi-concept image generation that merges multiple LoRA models, each fine-tuned for a distinct concept, into a single, unified model without additional individual fine-tuning. LoRACLR uses a contrastive objective to align and merge the weight spaces of these models, ensuring compatibility while minimizing interference. By enforcing distinct yet cohesive representations for each concept, LoRACLR enables efficient, scalable model composition for high-quality, multi-concept image synthesis. Our results highlight the effectiveness of LoRACLR in accurately merging multiple concepts, advancing the capabilities of personalized image generation.
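The abstract does not spell out the contrastive objective, so the following is only a toy sketch of how such a merge criterion might be scored, not the authors' implementation. It assumes each concept's LoRA update is a low-rank product `B @ A`, that a candidate merged update should reproduce a concept's own LoRA output on that concept's inputs (positive pairs), and that it should stay at least a margin away from the other concepts' outputs on the same inputs (negative pairs). The names `lora_delta`, `contrastive_loss`, and `merge_objective`, the margin value, and the random data are all hypothetical.

```python
import numpy as np

def lora_delta(B, A):
    """Weight update contributed by one LoRA adapter: delta_W = B @ A."""
    return B @ A

def contrastive_loss(merged_out, ref_out, is_positive, margin=1.0):
    """Margin-based contrastive loss over row-wise pairs.

    Positive pairs are pulled together (squared distance); negative pairs
    are penalized only while they are closer than `margin`.
    """
    dist = np.linalg.norm(merged_out - ref_out, axis=1)
    pos_term = dist ** 2
    neg_term = np.maximum(0.0, margin - dist) ** 2
    return float(np.mean(np.where(is_positive, pos_term, neg_term)))

def merge_objective(delta_merged, deltas, inputs, margin=1.0):
    """Hypothetical score for a candidate merged update (lower is better).

    deltas[i] is concept i's LoRA update, shape (d_out, d_in).
    inputs[i] holds sample input features for concept i, shape (n, d_in).
    """
    total = 0.0
    for i, x_i in enumerate(inputs):
        merged_out = x_i @ delta_merged.T      # merged model's response
        for j, delta_j in enumerate(deltas):
            ref_out = x_i @ delta_j.T          # concept j's own response
            is_pos = np.full(len(x_i), i == j) # same concept => positive pair
            total += contrastive_loss(merged_out, ref_out, is_pos, margin)
    return total / (len(inputs) * len(deltas))

# Toy data: two concepts with rank-2 adapters on an 8x8 layer (assumed sizes).
rng = np.random.default_rng(0)
d_in, d_out, rank, n = 8, 8, 2, 16
deltas = [
    lora_delta(rng.normal(size=(d_out, rank)), rng.normal(size=(rank, d_in)))
    for _ in range(2)
]
inputs = [rng.normal(size=(n, d_in)) for _ in range(2)]

# Score the simplest baseline: plain averaging of the two updates.
naive = sum(deltas) / len(deltas)
print("objective for naive average:", merge_objective(naive, deltas, inputs))
```

In a setup like this, a merged update would be found by minimizing `merge_objective` with a gradient-based optimizer rather than by averaging. The naive average is scored here only as a reference point.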
