CV | Mar 17, 2024

Zippo: Zipping Color and Transparency Distributions into a Single Diffusion Model

arXiv:2403.11077v2 | 2 citations | h-index: 16
Originality: Incremental advance
AI Analysis

This work addresses the need for efficient transparent image generation in computer vision and graphics, representing an incremental advancement by adapting diffusion models to handle multiple modalities.

The paper tackles the problem of generating transparent images (RGB images with alpha mattes) by proposing Zippo, a unified diffusion model that zips color and transparency distributions, enabling tasks like text-conditioned generation and translation between modalities with plausible results.

Beyond the text-to-image diffusion model's strength in generating high-quality images, recent studies have attempted to adapt its learned semantic knowledge to visual perception tasks. In this work, instead of converting a generative diffusion model into a visual perception model, we explore how to retain its generative ability while adding perceptive adaptation. To accomplish this, we present Zippo, a unified framework that zips the color and transparency distributions into a single diffusion model by expanding the diffusion latent into a joint representation of RGB images and alpha mattes. By alternately selecting one modality as the condition and applying the diffusion process to the other modality, Zippo can generate RGB images from alpha mattes and predict transparency from input images. In addition to single-modality prediction, we propose a modality-aware noise reassignment strategy that further enables Zippo to jointly generate RGB images and their corresponding alpha mattes under text guidance. Our experiments showcase Zippo's ability to perform efficient text-conditioned transparent image generation and present plausible results for Matte-to-RGB and RGB-to-Matte translation.
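The conditioning scheme described in the abstract can be illustrated with a small sketch. It is not the paper's implementation: it assumes the joint representation is a simple concatenation of an RGB latent and an alpha latent, that the conditioning modality is kept clean while the other is noised with the standard DDPM forward step, and it uses flat Python lists in place of latent tensors. The mode names and the function build_joint_latent are hypothetical, and the "joint" mode here simply noises both halves; it does not reproduce the modality-aware noise reassignment strategy, whose details the abstract does not give.

```python
import math
import random

# Hypothetical mode names, introduced only for this illustration.
MODES = ("matte_to_rgb", "rgb_to_matte", "joint")


def add_noise(x0, alpha_bar_t, rng):
    """DDPM-style forward noising: sqrt(a) * x0 + sqrt(1 - a) * eps, eps ~ N(0, 1)."""
    signal_scale = math.sqrt(alpha_bar_t)
    noise_scale = math.sqrt(1.0 - alpha_bar_t)
    return [signal_scale * v + noise_scale * rng.gauss(0.0, 1.0) for v in x0]


def build_joint_latent(rgb_latent, alpha_latent, mode, alpha_bar_t, rng):
    """Concatenate RGB and alpha latents, noising only the modality to be generated.

    Assumption: the conditioning modality is passed through unchanged.
    """
    if mode not in MODES:
        raise ValueError(f"unknown mode: {mode!r}")
    # RGB is the clean condition only when predicting the matte from the image.
    if mode == "rgb_to_matte":
        rgb_part = list(rgb_latent)
    else:
        rgb_part = add_noise(rgb_latent, alpha_bar_t, rng)
    # Alpha is the clean condition only when generating the image from the matte.
    if mode == "matte_to_rgb":
        alpha_part = list(alpha_latent)
    else:
        alpha_part = add_noise(alpha_latent, alpha_bar_t, rng)
    return rgb_part + alpha_part


if __name__ == "__main__":
    rng = random.Random(0)
    rgb = [0.1, 0.2, 0.3, 0.4]
    alpha = [1.0, 0.0, 0.5, 0.25]
    for mode in MODES:
        print(mode, build_joint_latent(rgb, alpha, mode, 0.5, rng))
```

In this sketch, switching the mode is all that distinguishes Matte-to-RGB, RGB-to-Matte, and joint generation, which mirrors the abstract's claim that a single model covers all three tasks.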

