CV · May 29

PRISM: Progressive Reasoning through Iterative Slot Memory for Vision

arXiv:2605.30942 · 79.1 · h-index: 25
Predicted impact top 30% in CV · last 90 days · Originality Incremental advance
AI Analysis

This work addresses the limited robustness of vision models under incomplete observations, such as occlusion.

This paper introduces PRISM, a pyramid vision architecture that iteratively refines object-centric visual features by recalling patterns from a learned memory. It achieves competitive performance across image classification, object detection, and semantic segmentation, and demonstrates improved robustness under incomplete observations like occlusion.

Modern vision models process images in a single feed-forward pass, which limits their ability to recover missing evidence or refine uncertain representations under incomplete observations. Inspired by the iterative nature of human perception, we introduce PRISM (Progressive Reasoning through Iterative Slot Memory), a pyramid vision architecture that reasons over images through iterative refinement. At a high level, PRISM groups visual features into object-centric representations, retrieves relevant patterns from a learned memory, and iteratively refines the representation to resolve ambiguity and recover missing information. This organize-recall-refine process operates recurrently across multiple scales, enabling progressive improvement of visual representations. Across standard vision tasks, including image classification, object detection, and semantic segmentation, PRISM achieves competitive performance while demonstrating improved robustness under incomplete observations such as occlusion. These results suggest that iterative reasoning with structured representations and memory is a promising direction for building more resilient and adaptive vision models. Source code and models will be released.
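The organize-recall-refine loop described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation, which has not been released at the time of this listing. Everything here is an assumption made for illustration: the function names (`organize`, `recall`, `refine`), the slot-attention-style grouping, the soft-attention memory lookup, the interpolation step, and the random arrays standing in for backbone features and a learned memory. It also runs at a single scale, whereas the abstract describes a recurrent process across multiple pyramid scales.

```python
import numpy as np

def softmax(x, axis):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def organize(features, slots):
    # Hypothetical grouping step: each of the N features distributes its
    # weight over the K slots (softmax over slots), then each slot becomes
    # the weighted mean of the features assigned to it.
    d = features.shape[1]
    assign = softmax(features @ slots.T / np.sqrt(d), axis=1)   # (N, K)
    weights = assign / (assign.sum(axis=0, keepdims=True) + 1e-8)
    return weights.T @ features                                 # (K, d)

def recall(slots, memory):
    # Hypothetical retrieval step: soft attention from slots over memory rows.
    d = slots.shape[1]
    attn = softmax(slots @ memory.T / np.sqrt(d), axis=1)       # (K, M)
    return attn @ memory                                        # (K, d)

def refine(slots, recalled, step=0.5):
    # Hypothetical refinement step: move each slot toward its recalled pattern.
    return (1.0 - step) * slots + step * recalled

def organize_recall_refine(features, init_slots, memory, num_iters=3):
    slots = init_slots
    for _ in range(num_iters):
        slots = organize(features, slots)
        slots = refine(slots, recall(slots, memory))
    return slots

rng = np.random.default_rng(0)
features = rng.normal(size=(64, 16))    # placeholder for N=64 patch features
init_slots = rng.normal(size=(4, 16))   # placeholder for K=4 initial slots
memory = rng.normal(size=(32, 16))      # placeholder for M=32 memory patterns
slots = organize_recall_refine(features, init_slots, memory)
print(slots.shape)  # (4, 16)
```

In a trained model the memory and slot initialisation would be learned parameters, and the refinement would more plausibly be a learned update than a fixed interpolation. The sketch only shows how the three steps compose into one iterative loop.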

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
