CL, AI, CV | Aug 19, 2022

Text to Image Generation: Leaving no Language Behind

arXiv:2208.09333v2 | 17 citations | h-index: 33
Originality: Synthesis-oriented
AI Analysis

The paper addresses linguistic bias in AI against non-native English speakers and aims to preserve linguistic diversity, though its contribution is incremental because it explores existing models.

The paper investigates how the performance of three popular text-to-image generators degrades when prompts are written in languages other than English, especially less widely used ones, and discusses improvements to make performance consistent across languages.

One of the latest applications of Artificial Intelligence (AI) is to generate images from natural language descriptions. These generators are now becoming available and achieve impressive results that have been used, for example, on the front covers of magazines. As the input to the generators is in the form of natural language text, a question that arises immediately is how these models behave when the input is written in different languages. In this paper, we perform an initial exploration of how the performance of three popular text-to-image generators depends on the language. The results show that there is a significant performance degradation when using languages other than English, especially for languages that are not widely used. This observation leads us to discuss different alternatives for how text-to-image generators can be improved so that performance is consistent across different languages. This is fundamental to ensure that this new technology can be used by non-native English speakers and to preserve linguistic diversity.
