CV | AI | Sep 5, 2025

COGITAO: A Visual Reasoning Framework To Study Compositionality & Generalization

arXiv:2509.05249v1 | h-index: 8 | Has Code
Originality: Incremental advance
AI Analysis

This work addresses the difficulty machine learning models have in composing learned concepts and applying them in novel settings. The advance is incremental because it builds on existing benchmarks such as ARC-AGI.

The authors study compositionality and generalization in visual reasoning by introducing COGITAO, a framework that can generate millions of unique rule-based tasks. In baseline experiments, state-of-the-art vision models consistently fail to generalize to novel combinations of familiar elements, despite strong in-domain performance.

The ability to compose learned concepts and apply them in novel settings is key to human intelligence, but remains a persistent limitation in state-of-the-art machine learning models. To address this issue, we introduce COGITAO, a modular and extensible data generation framework and benchmark designed to systematically study compositionality and generalization in visual domains. Drawing inspiration from ARC-AGI's problem-setting, COGITAO constructs rule-based tasks which apply a set of transformations to objects in grid-like environments. It supports composition, at adjustable depth, over a set of 28 interoperable transformations, along with extensive control over grid parametrization and object properties. This flexibility enables the creation of millions of unique task rules -- surpassing concurrent datasets by several orders of magnitude -- across a wide range of difficulties, while allowing virtually unlimited sample generation per rule. We provide baseline experiments using state-of-the-art vision models, highlighting their consistent failures to generalize to novel combinations of familiar elements, despite strong in-domain performance. COGITAO is fully open-sourced, including all code and datasets, to support continued research in this field.
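
To make the idea of composing transformations at adjustable depth concrete, here is a minimal sketch in plain Python. It is not COGITAO's code or API: the three transformations, the names `TRANSFORMS`, `compose`, `enumerate_rules`, and `split_compositional`, and the choice to treat a rule as an ordered sequence of transformation names are all assumptions made for illustration. It operates on the whole grid rather than on individual objects, which is a further simplification.

```python
from itertools import product

# Hypothetical stand-ins for grid transformations.
# These are NOT the 28 transformations shipped with COGITAO.
def flip_horizontal(grid):
    """Mirror every row left to right."""
    return [row[::-1] for row in grid]

def flip_vertical(grid):
    """Mirror the grid top to bottom."""
    return [list(row) for row in grid[::-1]]

def rotate_90(grid):
    """Rotate the grid clockwise by 90 degrees."""
    return [list(row) for row in zip(*grid[::-1])]

TRANSFORMS = {
    "flip_horizontal": flip_horizontal,
    "flip_vertical": flip_vertical,
    "rotate_90": rotate_90,
}

def compose(names):
    """Build a rule that applies the named transformations in order."""
    def rule(grid):
        for name in names:
            grid = TRANSFORMS[name](grid)
        return grid
    return rule

def enumerate_rules(depth):
    """List every ordered sequence of `depth` transformation names.

    Assumption: repetition is allowed and order matters, so the count
    is len(TRANSFORMS) ** depth in this sketch.
    """
    return list(product(TRANSFORMS, repeat=depth))

def split_compositional(rules, held_out):
    """Keep held-out combinations out of training and use them for testing."""
    held_out = set(held_out)
    train = [r for r in rules if r not in held_out]
    test = [r for r in rules if r in held_out]
    return train, test

if __name__ == "__main__":
    grid = [[1, 0],
            [0, 2]]
    rule = compose(("flip_horizontal", "rotate_90"))
    print(rule(grid))

    rules = enumerate_rules(depth=2)
    train, test = split_compositional(rules, [("flip_horizontal", "rotate_90")])
    print(len(rules), len(train), len(test))
```

In this sketch each held-out rule uses only transformations that also appear in training rules, so a model is tested on a novel combination of familiar elements, which is the kind of generalization the abstract reports current vision models failing at.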
