REFINE-CONTROL: A Semi-supervised Distillation Method For Conditional Image Generation
This work addresses the challenges of deploying conditional image generation on edge devices and offers a way to reduce cost and privacy risk. The contribution appears incremental, however, since it builds on existing distillation methods.
The paper tackles the high resource demands and data scarcity that limit conditional image generation models on edge devices. It proposes Refine-Control, a semi-supervised distillation framework that significantly reduces computational cost and latency while maintaining high-fidelity generation and controllability, as shown by comparative metrics.
Conditional image generation models have achieved remarkable results by leveraging text-based control to generate customized images. However, the high resource demands of these models and the scarcity of well-annotated data have hindered their deployment on edge devices, leading to enormous costs and privacy concerns, especially when user data is sent to a third party. To overcome these challenges, we propose Refine-Control, a semi-supervised distillation framework. Specifically, we improve the performance of the student model by introducing a tri-level knowledge fusion loss to transfer different levels of knowledge. To enhance generalization and alleviate dataset scarcity, we introduce a semi-supervised distillation method utilizing both labeled and unlabeled data. Our experiments reveal that Refine-Control achieves significant reductions in computational cost and latency, while maintaining high-fidelity generation capabilities and controllability, as quantified by comparative metrics.
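To make the combination of a multi-term distillation loss with labeled and unlabeled data concrete, the following is a minimal sketch and not the formulation used by Refine-Control. The abstract does not say what the three knowledge levels are, so the split into an output-level term, a feature-level term, and a supervised task term is an assumption made here for illustration. The function names, the weights, and the use of mean squared error are likewise hypothetical.

```python
from typing import Optional, Sequence, Tuple


def mse(a: Sequence[float], b: Sequence[float]) -> float:
    """Mean squared error between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def sketch_distillation_loss(
    student_out: Sequence[float],
    teacher_out: Sequence[float],
    student_feat: Sequence[float],
    teacher_feat: Sequence[float],
    target: Optional[Sequence[float]] = None,
    weights: Tuple[float, float, float] = (1.0, 1.0, 1.0),
) -> float:
    """Hypothetical three-term loss for one sample.

    ASSUMPTION: the three "levels" are (1) matching the teacher's output,
    (2) matching the teacher's intermediate features, and (3) matching the
    ground-truth target. The source does not define them.
    """
    w_out, w_feat, w_task = weights
    # Teacher-driven terms need no annotation, so they apply to every sample.
    loss = w_out * mse(student_out, teacher_out)
    loss += w_feat * mse(student_feat, teacher_feat)
    # The supervised term is added only when a label is available.
    if target is not None:
        loss += w_task * mse(student_out, target)
    return loss


# Unlabeled sample: only the two teacher-driven terms contribute.
unlabeled = sketch_distillation_loss([1.0, 2.0], [0.0, 2.0], [1.0], [0.0])

# Labeled sample: the supervised term is added on top.
labeled = sketch_distillation_loss(
    [1.0, 2.0], [0.0, 2.0], [1.0], [0.0], target=[1.0, 4.0]
)
```

In this sketch an unlabeled sample still produces a training signal through the teacher, which is the general idea behind using unlabeled data to offset scarce annotations. How the actual method weights or schedules its terms is not stated in the abstract.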