Learning to Color from Language
This work addresses the need for user-controllable image colorization in computer vision. Its contribution is incremental: it adds language as a conditioning signal so that users can steer the colorization.
The paper tackles automatic colorization of greyscale images by conditioning on language, so that users can manipulate colorizations through captions. It presents two architectures, both of which produce more accurate and plausible results than a language-agnostic baseline, and it shows that changing the descriptive color words in a caption can dramatically alter the output.
Automatic colorization is the process of adding color to greyscale images. We condition this process on language, allowing end users to manipulate a colorized image by feeding in different captions. We present two different architectures for language-conditioned colorization, both of which produce more accurate and plausible colorizations than a language-agnostic version. Through this language-based framework, we can dramatically alter colorizations by manipulating descriptive color words in captions.
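To make the idea of caption-driven control concrete, the following is a minimal toy sketch, not either of the paper's architectures. It assumes a hypothetical hand-written table `COLOR_CHROMA` that maps a few color words to made-up chroma offsets in a Lab-like space, and a hypothetical `colorize` function that keeps the input lightness and scales the caption's chroma by it. A learned model would replace both pieces; the sketch only illustrates that the output depends on the greyscale input and on the caption together.

```python
from typing import Dict, List, Tuple

# Hypothetical toy vocabulary: color word -> (a, b) chroma offset.
# The numbers are invented for illustration, not taken from the paper.
COLOR_CHROMA: Dict[str, Tuple[float, float]] = {
    "red": (0.6, 0.4),
    "green": (-0.6, 0.5),
    "blue": (0.1, -0.7),
    "yellow": (0.0, 0.8),
}


def encode_caption(caption: str) -> Tuple[float, float]:
    """Average the chroma offsets of the color words found in the caption."""
    words = (token.strip(".,;:!?").lower() for token in caption.split())
    hits = [COLOR_CHROMA[w] for w in words if w in COLOR_CHROMA]
    if not hits:
        # No color word: fall back to zero chroma (a grey output).
        return (0.0, 0.0)
    n = len(hits)
    return (sum(h[0] for h in hits) / n, sum(h[1] for h in hits) / n)


def colorize(grey: List[List[float]], caption: str) -> List[List[Tuple[float, float, float]]]:
    """Return (L, a, b) per pixel: lightness is kept, chroma comes from the caption."""
    a, b = encode_caption(caption)
    return [[(l, a * l, b * l) for l in row] for row in grey]


if __name__ == "__main__":
    grey_image = [[0.0, 0.5], [1.0, 0.25]]  # toy 2x2 lightness values in [0, 1]
    print(colorize(grey_image, "a red bird on a branch"))
    print(colorize(grey_image, "a blue bird on a branch"))
```

Swapping "red" for "blue" in the caption changes the predicted chroma for every pixel while leaving the lightness channel untouched, which is the kind of manipulation the abstract describes.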