CV · AI · LG · Nov 30, 2022

Multiresolution Textual Inversion

arXiv:2211.17115v1 · 38 citations · h-index: 71 · Has Code
Originality: Incremental advance
AI Analysis

This work addresses user control over detail and composition in text-to-image generation. It appears incremental, since it builds on the existing Textual Inversion method.

The authors extended Textual Inversion to learn pseudo-words representing concepts at multiple resolutions, enabling image generation with varying levels of detail and manipulation via language, such as producing exact objects or rough outlines based on resolution parameters.

We extend Textual Inversion to learn pseudo-words that represent a concept at different resolutions. This allows us to generate images that use the concept with different levels of detail and also to manipulate different resolutions using language. Once learned, the user can generate images at different levels of agreement with the original concept; "A photo of $S^*(0)$" produces the exact object while the prompt "A photo of $S^*(0.8)$" only matches the rough outlines and colors. Our framework allows us to generate images that use different resolutions of an image (e.g. details, textures, styles) as separate pseudo-words that can be composed in various ways. We open-source our code at the following URL: https://github.com/giannisdaras/multires_textual_inversion
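To make the prompt notation in the abstract concrete, here is a minimal sketch of how a resolution-tagged pseudo-word such as $S^*(0.8)$ might be resolved to one of several learned embedding slots before the prompt is tokenized. Everything in it is an assumption made for illustration: the placeholder syntax, the `<concept-k>` token names, the default of 10 slots, and the functions `resolution_to_bucket` and `expand_prompt` are hypothetical and are not taken from the authors' repository.

```python
import re

# Hypothetical placeholder syntax that mirrors the abstract's S^*(r) notation.
# Optional surrounding "$" signs (LaTeX math delimiters) are consumed as well.
PLACEHOLDER = re.compile(r"\$?S\^\*\(([0-9]*\.?[0-9]+)\)\$?")


def resolution_to_bucket(r: float, num_buckets: int = 10) -> int:
    """Map a resolution value r in [0, 1] to an embedding slot index.

    Assumption: r = 0 selects the most detailed slot and values near 1
    select the coarsest one. The number of slots is illustrative.
    """
    if not 0.0 <= r <= 1.0:
        raise ValueError(f"resolution must lie in [0, 1], got {r}")
    # Clamp so that r == 1.0 falls into the last slot instead of overflowing.
    return min(int(r * num_buckets), num_buckets - 1)


def expand_prompt(prompt: str, num_buckets: int = 10) -> str:
    """Replace each S^*(r) placeholder with a hypothetical slot token."""

    def substitute(match: re.Match) -> str:
        bucket = resolution_to_bucket(float(match.group(1)), num_buckets)
        return f"<concept-{bucket}>"

    return PLACEHOLDER.sub(substitute, prompt)


if __name__ == "__main__":
    print(expand_prompt("A photo of $S^*(0)$"))
    print(expand_prompt("A photo of $S^*(0.8)$"))
```

Under these assumptions, each slot token would be bound to its own learned embedding, so choosing a larger resolution value swaps in an embedding that captures only coarse structure such as outlines and colors.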

Code Implementations: 1 repo
