CV · AI · May 5, 2022

Text to artistic image generation

arXiv:2205.02439v1 · 2 citations · h-index: 5
Originality: Synthesis-oriented
AI Analysis

This work addresses a specific accessibility problem for people with disabilities, but it is incremental: it relies on existing methods and introduces no new algorithms.

The paper tackles the generation of artistic images from text descriptions, motivated by people with hand disabilities. It proposes a three-step pipeline that combines text-to-image generation, genre classification, and style transfer, and so produces artistic images without training directly on paired text-art datasets.

Painting is one of the ways people express their ideas, but what if people with hand disabilities want to paint? To tackle this challenge, we create an end-to-end solution that generates artistic images from text descriptions. However, because datasets of paired text descriptions and artistic images are lacking, it is hard to directly train an algorithm that creates art from text input. To address this issue, we split the task into three steps: (1) generate a realistic image from a text description using a Dynamic Memory Generative Adversarial Network (arXiv:1904.01310), (2) classify the image into a genre that exists in the WikiArt dataset using ResNet (arXiv:1512.03385), and (3) select a style that is compatible with the genre and transfer it to the generated image using a neural artistic stylization network (arXiv:1705.06830).
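The three steps above can be read as a simple composition of stages. The following is a minimal sketch of that control flow only, not the authors' implementation: the `ArtPipeline` class, the stand-in stage functions, the genre-to-style table, and the rule of picking the first compatible style are all hypothetical. In a real system the three callables would wrap the text-to-image GAN, the ResNet genre classifier, and the stylization network.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Placeholder image type for illustration: a 2-D grid of grayscale values.
Image = List[List[float]]


@dataclass
class ArtPipeline:
    """Hypothetical wrapper that chains the three stages in order."""
    generate: Callable[[str], Image]         # step 1: text -> realistic image
    classify_genre: Callable[[Image], str]   # step 2: image -> genre label
    stylize: Callable[[Image, str], Image]   # step 3: (image, style) -> art
    genre_to_styles: Dict[str, List[str]]    # assumed compatibility table

    def run(self, text: str) -> Image:
        photo = self.generate(text)
        genre = self.classify_genre(photo)
        styles = self.genre_to_styles.get(genre)
        if not styles:
            raise ValueError(f"no compatible style for genre {genre!r}")
        # Assumption: take the first compatible style. The source does not
        # say how one style is chosen among several compatible ones.
        return self.stylize(photo, styles[0])


# Stand-in stages so the sketch runs without any trained model.
def fake_generate(text: str) -> Image:
    return [[float(len(text))] * 2 for _ in range(2)]


def fake_classify(image: Image) -> str:
    return "landscape"


def fake_stylize(image: Image, style: str) -> Image:
    scale = 0.5 if style == "Impressionism" else 1.0
    return [[value * scale for value in row] for row in image]


pipeline = ArtPipeline(
    generate=fake_generate,
    classify_genre=fake_classify,
    stylize=fake_stylize,
    genre_to_styles={"landscape": ["Impressionism"]},  # illustrative entry
)
result = pipeline.run("a bird")
```

Keeping the stages as interchangeable callables mirrors the motivation stated in the abstract: each stage can be trained or swapped on its own, so no paired text-art dataset is needed.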

Code Implementations: 1 repo
