CV | AI | Dec 8, 2023

Fine-Tuning InstructPix2Pix for Advanced Image Colorization

arXiv:2312.04780v1 | 3 citations | h-index: 1 | Has Code
Originality: Incremental advance
AI Analysis

This work addresses colorization for human images, but it is incremental as it applies fine-tuning to an existing model for a specific domain.

This paper tackled the problem of human image colorization by fine-tuning the InstructPix2Pix model, resulting in improved performance over the original model on multiple quantitative metrics and more realistic qualitative outputs.

This paper presents a novel approach to human image colorization by fine-tuning the InstructPix2Pix model, which integrates a language model (GPT-3) with a text-to-image model (Stable Diffusion). Although the original InstructPix2Pix model is proficient at editing images based on textual instructions, it exhibits limitations in the focused domain of colorization. To address this, we fine-tuned the model on the IMDB-WIKI dataset, pairing black-and-white images with a diverse set of colorization prompts generated by ChatGPT. This paper contributes by (1) applying fine-tuning techniques to Stable Diffusion models specifically for colorization tasks, and (2) employing generative models to create varied conditioning prompts. After fine-tuning, our model outperforms the original InstructPix2Pix model quantitatively on multiple metrics and produces more realistically colored images qualitatively. The code for this project is available on GitHub at https://github.com/AllenAnZifeng/DeepLearning282.
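The data pairing described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the prompt pool, the field names (`input_image`, `edit_prompt`, `edited_image`), and the BT.601 luma weights used for the grayscale conversion are all assumptions made for illustration, and images are represented as plain nested lists of RGB tuples so the example needs no external libraries.

```python
import random

# Hypothetical prompt pool; the paper's ChatGPT-generated prompts are not reproduced here.
PROMPTS = [
    "colorize this image",
    "add realistic colors to this photo",
    "turn this black-and-white picture into a color one",
]


def to_grayscale(pixels):
    """Convert rows of (R, G, B) tuples to gray using BT.601 luma weights.

    The gray value is replicated across three channels so the input keeps
    the same shape as the color target. The weights are an assumption; the
    source does not say how its black-and-white images were produced.
    """
    gray = []
    for row in pixels:
        gray_row = []
        for r, g, b in row:
            y = round(0.299 * r + 0.587 * g + 0.114 * b)
            gray_row.append((y, y, y))
        gray.append(gray_row)
    return gray


def make_example(color_pixels, rng):
    """Build one (input, instruction, target) triple for instruction-based fine-tuning."""
    return {
        "input_image": to_grayscale(color_pixels),  # black-and-white input
        "edit_prompt": rng.choice(PROMPTS),         # randomly sampled instruction
        "edited_image": color_pixels,               # original color image as target
    }


if __name__ == "__main__":
    rng = random.Random(0)
    tiny_image = [[(255, 0, 0), (0, 255, 0)], [(0, 0, 255), (255, 255, 255)]]
    example = make_example(tiny_image, rng)
    print(example["edit_prompt"])
    print(example["input_image"])
```

Sampling a different prompt for each image is one simple way to obtain the varied conditioning the abstract mentions; a real pipeline would operate on image files rather than nested lists.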

Code Implementations: 1 repo