AI | CV | CY | Jun 12, 2023

Explaining CLIP through Co-Creative Drawings and Interaction

arXiv:2306.07429v1 | 4 citations | h-index: 6
Originality: Synthesis-oriented
AI Analysis

This work provides insights into CLIP model interpretability through an artistic application, but it is incremental as it builds on existing methods without major technical advancements.

The paper analyzes a visual archive of drawings from an interactive robotic art installation that used the CLIPdraw model to transform audience members' dreams into images, and proposes four groupings that describe CLIP-generated results by concept representation accuracy. It argues that these clusters improve understanding of the neural model and showcases the system's unexpected, dream-like outputs.

This paper analyses a visual archive of drawings produced by an interactive robotic art installation in which audience members narrated their dreams to a system powered by the CLIPdraw deep learning (DL) model, which interpreted and transformed those dreams into images. The resulting archive of prompt-image pairs was examined and clustered based on concept representation accuracy. As a result of the analysis, the paper proposes four groupings for describing and explaining CLIP-generated results: clear concept, text-to-text as image, indeterminacy and confusion, and lost in translation. The article offers a glimpse into a collection of dreams interpreted, mediated and given form by Artificial Intelligence (AI), showcasing the often unexpected, visually compelling or, indeed, dream-like output of the system, with an emphasis on the processes and results of translation between languages, sign systems and the various modules of the installation. In the end, the paper argues that the proposed clusters support a better understanding of the neural model.

