CV, AI, LG · Jun 21, 2025

SynDaCaTE: A Synthetic Dataset For Evaluating Part-Whole Hierarchical Inference

arXiv:2506.17558v1
h-index: 27
Originality: Synthesis-oriented
AI Analysis

SynDaCaTE gives computer vision researchers a tool for testing and designing models that perform hierarchical inference. The contribution is incremental, since it focuses on evaluation rather than proposing a new model.

The authors tackled the difficulty of evaluating whether capsule networks actually learn part-whole hierarchies by creating SynDaCaTE, a synthetic dataset, and used it to identify a bottleneck in an existing model and show that permutation-equivariant self-attention is effective for parts-to-wholes inference.

Learning to infer object representations, and in particular part-whole hierarchies, has been the focus of extensive research in computer vision, in pursuit of improving data efficiency, systematic generalisation, and robustness. Models which are designed to infer part-whole hierarchies, often referred to as capsule networks, are typically trained end-to-end on supervised tasks such as object classification, in which case it is difficult to evaluate whether such a model actually learns to infer part-whole hierarchies, as claimed. To address this difficulty, we present a SYNthetic DAtaset for CApsule Testing and Evaluation, abbreviated as SynDaCaTE, and establish its utility by (1) demonstrating the precise bottleneck in a prominent existing capsule model, and (2) demonstrating that permutation-equivariant self-attention is highly effective for parts-to-wholes inference, which motivates future directions for designing effective inductive biases for computer vision.
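
The abstract's second finding relies on self-attention being permutation-equivariant: reordering the input set of part vectors reorders the outputs in the same way and changes nothing else. The sketch below illustrates that property only. It is not the paper's model. It assumes a single attention head with no positional encoding and random weights, and every name in it (`self_attention`, `parts`, `wq`, `wk`, `wv`) is hypothetical.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(x, wq, wk, wv):
    """Single-head self-attention over a set of vectors (no positional encoding)."""
    q, k, v = matmul(x, wq), matmul(x, wk), matmul(x, wv)
    scale = math.sqrt(len(wk[0]))
    k_t = [list(col) for col in zip(*k)]  # transpose of k
    scores = matmul(q, k_t)
    weights = [softmax([s / scale for s in row]) for row in scores]
    return matmul(weights, v)


def random_matrix(rows, cols, rng):
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
n_parts, dim = 5, 4  # illustrative sizes, not taken from the paper
parts = random_matrix(n_parts, dim, rng)
wq, wk, wv = (random_matrix(dim, dim, rng) for _ in range(3))

perm = [3, 0, 4, 1, 2]  # an arbitrary reordering of the parts
out = self_attention(parts, wq, wk, wv)
out_perm = self_attention([parts[i] for i in perm], wq, wk, wv)

# Equivariance: row i of the permuted output equals row perm[i] of the original.
max_diff = max(
    abs(a - b)
    for i, p in enumerate(perm)
    for a, b in zip(out_perm[i], out[p])
)
is_equivariant = max_diff < 1e-9
print(is_equivariant)
```

The comparison uses a small tolerance because permuting the inputs changes the order of floating-point summation. Adding a positional encoding to `parts` before attention would generally break the property.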

