CV | LG | Aug 16, 2023

Painter: Teaching Auto-regressive Language Models to Draw Sketches

arXiv:2308.08520v1 | 6 citations | h-index: 38
Originality: Incremental advance
AI Analysis

This work introduces a novel application of LLMs to image generation, specifically for creating sketches from text, which could benefit creative and design tools.

The authors fine-tune a pre-trained large language model to produce brush strokes auto-regressively from text descriptions. The resulting model can generate sketches, remove objects from the canvas, and detect and classify objects in sketches, with results the authors describe as encouraging.

Large language models (LLMs) have made tremendous progress in natural language understanding and have also been successfully adopted in other domains such as computer vision, robotics, and reinforcement learning. In this work, we apply LLMs to image generation tasks by directly generating the virtual brush strokes to paint an image. We present Painter, an LLM that can convert user prompts in text description format to sketches by generating the corresponding brush strokes in an auto-regressive way. We construct Painter based on an off-the-shelf LLM that is pre-trained on a large text corpus, by fine-tuning it on the new task while preserving language understanding capabilities. We create a dataset of diverse multi-object sketches paired with textual prompts that covers several object types and tasks. Painter can generate sketches from text descriptions, remove objects from the canvas, and detect and classify objects in sketches. Although this is pioneering work in using LLMs for auto-regressive image generation, the results are very encouraging.
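The auto-regressive stroke generation described in the abstract can be pictured with a minimal sketch. Everything below is an illustrative assumption rather than the paper's method: the stroke encoding (four integer tokens per straight line), the `<eos>` marker, and the names `generate_strokes`, `rasterize`, and `scripted_model` are hypothetical, and the fine-tuned LLM is replaced by a scripted stand-in that replays a fixed square.

```python
from typing import Callable, List, Tuple

# Hypothetical stroke format: a straight line from (x0, y0) to (x1, y1).
Stroke = Tuple[int, ...]
EOS = "<eos>"  # assumed end-of-sequence token


def generate_strokes(prompt: str,
                     next_token: Callable[[str, List[str]], str],
                     max_tokens: int = 64) -> List[Stroke]:
    """Auto-regressive loop: each token depends on the prompt and all earlier tokens."""
    tokens: List[str] = []
    for _ in range(max_tokens):
        tok = next_token(prompt, tokens)
        if tok == EOS:
            break
        tokens.append(tok)
    # Assumed encoding: every 4 consecutive integer tokens form one stroke.
    usable = len(tokens) - len(tokens) % 4
    values = [int(t) for t in tokens[:usable]]
    return [tuple(values[i:i + 4]) for i in range(0, usable, 4)]


def rasterize(strokes: List[Stroke], size: int = 8) -> List[str]:
    """Draw each stroke onto a size x size character canvas by linear interpolation."""
    canvas = [[" "] * size for _ in range(size)]
    for x0, y0, x1, y1 in strokes:
        steps = max(abs(x1 - x0), abs(y1 - y0), 1)
        for s in range(steps + 1):
            x = round(x0 + (x1 - x0) * s / steps)
            y = round(y0 + (y1 - y0) * s / steps)
            if 0 <= x < size and 0 <= y < size:
                canvas[y][x] = "#"
    return ["".join(row) for row in canvas]


def scripted_model(prompt: str, history: List[str]) -> str:
    """Stand-in for the fine-tuned LLM: replays a fixed square and ignores the prompt."""
    script = ["1", "1", "6", "1",
              "6", "1", "6", "6",
              "6", "6", "1", "6",
              "1", "6", "1", "1", EOS]
    return script[len(history)]


strokes = generate_strokes("draw a square", scripted_model)
picture = rasterize(strokes)
print("\n".join(picture))
```

In a real system the scripted function would be replaced by a call that samples the next token from the language model conditioned on the prompt and the strokes emitted so far; the loop structure is the only part this sketch is meant to convey.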

