NC CV | Feb 24, 2025

Unraveling the geometry of visual relational reasoning

Jiaqi Shang, Gabriel Kreiman, Haim Sompolinsky

arXiv:2502.17382v2 | 1 citation | h-index: 82

Originality: Incremental advance

AI Analysis

This work addresses flexible relational reasoning in AI for visual tasks, offering incremental insights through a new benchmark and a geometric analysis of model representations.

The paper tackles abstract relational reasoning in neural networks by introducing the SimplifiedRPM benchmark and testing four models on it. The Scattering Compositional Learner (SCL) generalizes best and aligns most closely with human behavior. A geometric analysis identifies representation properties that predict generalization and reveals a trade-off between signal and dimensionality.

Humans readily generalize abstract relations, such as recognizing "constant" in shape or color, whereas neural networks struggle, limiting their flexible reasoning. To investigate mechanisms underlying such generalization, we introduce SimplifiedRPM, a novel benchmark for systematically evaluating abstract relational reasoning, addressing limitations in prior datasets. In parallel, we conduct human experiments to quantify relational difficulty, enabling direct model-human comparisons. Testing four models, ResNet-50, Vision Transformer, Wild Relation Network, and Scattering Compositional Learner (SCL), we find that SCL generalizes best and most closely aligns with human behavior. Using a geometric approach, we identify key representation properties that accurately predict generalization and uncover a fundamental trade-off between signal and dimensionality: novel relations compress into training-induced subspaces. Layer-wise analysis reveals where relational structure emerges, highlights bottlenecks, and generates concrete hypotheses about abstract reasoning in the brain. Motivated by these insights, we propose SNRloss, a novel objective explicitly balancing representation geometry. Our results establish a geometric foundation for relational reasoning, paving the way for more human-like visual reasoning in AI and opening promising avenues for extending geometric analysis to broader cognitive tasks.
