CV · Nov 29, 2023

Visual Anagrams: Generating Multi-View Optical Illusions with Diffusion Models

arXiv:2311.17919v2 · 44 citations · h-index: 6
Originality: Incremental advance
AI Analysis

This work addresses the challenge of generating optical illusions for applications in art, entertainment, and visual perception research, though it is incremental as it builds on existing diffusion models.

The paper tackles the synthesis of multi-view optical illusions: images that change appearance under transformations such as flips or rotations. It proposes a zero-shot method built on off-the-shelf text-to-image diffusion models, and supports the method's effectiveness and flexibility with qualitative and quantitative results.

We address the problem of synthesizing multi-view optical illusions: images that change appearance upon a transformation, such as a flip or rotation. We propose a simple, zero-shot method for obtaining these illusions from off-the-shelf text-to-image diffusion models. During the reverse diffusion process, we estimate the noise from different views of a noisy image, and then combine these noise estimates together and denoise the image. A theoretical analysis suggests that this method works precisely for views that can be written as orthogonal transformations, of which permutations are a subset. This leads to the idea of a visual anagram--an image that changes appearance under some rearrangement of pixels. This includes rotations and flips, but also more exotic pixel permutations such as a jigsaw rearrangement. Our approach also naturally extends to illusions with more than two views. We provide both qualitative and quantitative results demonstrating the effectiveness and flexibility of our method. Please see our project webpage for additional visualizations and results: https://dangeng.github.io/visual_anagrams/
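The combination step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: `dummy_noise_estimate` is a hypothetical stand-in for a text-conditioned diffusion model, the integer "prompt seeds" stand in for text prompts, and averaging is assumed as the way the per-view estimates are combined (the abstract says only that they are combined). The helper `permutation_matrix` is also hypothetical and is there only to check that a pixel permutation is an orthogonal transformation.

```python
import numpy as np


def make_views():
    """Return (forward, inverse) pairs for two views: identity and a 180-degree rotation."""
    identity = (lambda x: x, lambda x: x)
    rot180 = (lambda x: np.rot90(x, 2), lambda x: np.rot90(x, -2))
    return [identity, rot180]


def dummy_noise_estimate(x_view, prompt_seed):
    """Hypothetical placeholder for a diffusion model's noise prediction.

    A real model would take the noisy image, a timestep, and a text prompt.
    """
    rng = np.random.default_rng(prompt_seed)
    return 0.1 * x_view + rng.standard_normal(x_view.shape)


def combined_noise_estimate(x_t, views, prompt_seeds, estimate=dummy_noise_estimate):
    """Estimate noise in each view, map it back to the base frame, then average.

    Averaging is an assumption of this sketch.
    """
    aligned = []
    for (forward, inverse), seed in zip(views, prompt_seeds):
        eps_view = estimate(forward(x_t), seed)  # noise predicted in the transformed view
        aligned.append(inverse(eps_view))        # undo the view so estimates line up
    return np.mean(aligned, axis=0)


def permutation_matrix(perm):
    """Build the matrix that reorders a flattened image according to `perm`."""
    return np.eye(len(perm))[perm]


rng = np.random.default_rng(0)
x_t = rng.standard_normal((8, 8))  # toy "noisy image"
eps = combined_noise_estimate(x_t, make_views(), prompt_seeds=[0, 1])

# A permutation of pixels is orthogonal: P^T P = I.
P = permutation_matrix(rng.permutation(x_t.size))
is_orthogonal = np.allclose(P.T @ P, np.eye(x_t.size))
```

In a full sampler, the combined estimate would replace the single-prompt noise estimate at every reverse diffusion step. The orthogonality check reflects the abstract's condition on valid views: rotations, flips, and jigsaw rearrangements all reorder pixels without changing their values.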

