AI | CV | Mar 30, 2025

GenVP: Generating Visual Puzzles with Contrastive Hierarchical VAEs

arXiv:2503.23598v1 | h-index: 9 | ICLR
Originality: Incremental advance
AI Analysis

The paper addresses limited machine creativity in visual reasoning, a problem of interest to AI researchers. The advance is incremental because it builds on existing generative methods.

The paper tackles the challenge of generating new Raven's Progressive Matrices puzzles, a task beyond solving existing ones, and achieves state-of-the-art performance in puzzle-solving accuracy and out-of-distribution generalization across 22 scenarios.

Raven's Progressive Matrices (RPMs) is an established benchmark to examine the ability to perform high-level abstract visual reasoning (AVR). Despite the current success of algorithms that solve this task, humans can generalize beyond a given puzzle and create new puzzles given a set of rules, whereas machines remain locked in solving a fixed puzzle from a curated choice list. We propose Generative Visual Puzzles (GenVP), a framework to model the entire RPM generation process, a substantially more challenging task. Our model's capability spans from generating multiple solutions for one specific problem prompt to creating complete new puzzles out of the desired set of rules. Experiments on five different datasets indicate that GenVP achieves state-of-the-art (SOTA) performance both in puzzle-solving accuracy and out-of-distribution (OOD) generalization in 22 OOD scenarios. Compared to SOTA generative approaches, which struggle to solve RPMs when the feasible solution space increases, GenVP efficiently generalizes to these challenging setups. Moreover, our model demonstrates the ability to produce a wide range of complete RPMs given a set of abstract rules by effectively capturing the relationships between abstract rules and visual object properties.

