CV AI LG | Jun 21, 2025

SynDaCaTE: A Synthetic Dataset For Evaluating Part-Whole Hierarchical Inference

arXiv:2506.17558v1 | h-index: 27

Originality: Synthesis-oriented

AI Analysis

SynDaCaTE gives computer vision researchers a tool for testing and designing models for hierarchical inference. The contribution is incremental, since it focuses on evaluation rather than proposing a new model.

Evaluating whether capsule networks actually learn part-whole hierarchies is difficult. The authors address this by creating SynDaCaTE, a synthetic dataset, and use it to identify a bottleneck in an existing capsule model and to show that permutation-equivariant self-attention is effective for parts-to-wholes inference.

Learning to infer object representations, and in particular part-whole hierarchies, has been the focus of extensive research in computer vision, in pursuit of improving data efficiency, systematic generalisation, and robustness. Models which are designed to infer part-whole hierarchies, often referred to as capsule networks, are typically trained end-to-end on supervised tasks such as object classification, in which case it is difficult to evaluate whether such a model actually learns to infer part-whole hierarchies, as claimed. To address this difficulty, we present a SYNthetic DAtaset for CApsule Testing and Evaluation, abbreviated as SynDaCaTE, and establish its utility by (1) demonstrating the precise bottleneck in a prominent existing capsule model, and (2) demonstrating that permutation-equivariant self-attention is highly effective for parts-to-wholes inference, which motivates future directions for designing effective inductive biases for computer vision.
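
To illustrate the term "permutation-equivariant self-attention" used in the abstract, the sketch below runs a single-head self-attention layer over a small set of part vectors and checks that reordering the input parts reorders the outputs in the same way. It is not the paper's model: the weights are random, the sizes are arbitrary, no positional encoding is used, and all names (`self_attention`, `parts`, `w_q`, `w_k`, `w_v`) are hypothetical.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(parts, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention over a set of part vectors.

    No positional encoding is added, so the layer treats its input as a set.
    """
    q = matmul(parts, w_q)
    k = matmul(parts, w_k)
    v = matmul(parts, w_v)
    scale = math.sqrt(len(q[0]))
    scores = [[sum(a * b for a, b in zip(qi, kj)) / scale for kj in k] for qi in q]
    weights = [softmax(r) for r in scores]
    return matmul(weights, v)


def random_matrix(rng, rows, cols):
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


# Illustrative sizes and random weights (assumptions, not values from the paper).
rng = random.Random(0)
n_parts, dim = 5, 4
parts = random_matrix(rng, n_parts, dim)
w_q = random_matrix(rng, dim, dim)
w_k = random_matrix(rng, dim, dim)
w_v = random_matrix(rng, dim, dim)

# Shuffle the order of the parts.
perm = list(range(n_parts))
rng.shuffle(perm)
shuffled_parts = [parts[i] for i in perm]

out = self_attention(parts, w_q, w_k, w_v)
out_shuffled = self_attention(shuffled_parts, w_q, w_k, w_v)

# Equivariance: attention(P x) should equal P attention(x), up to float rounding.
expected = [out[i] for i in perm]
gap = max(
    abs(a - b)
    for row_a, row_b in zip(out_shuffled, expected)
    for a, b in zip(row_a, row_b)
)
print("max deviation from equivariance:", gap)
```

The deviation is only floating-point rounding, because permuting the inputs changes the order in which the sums are accumulated. Adding a positional encoding to `parts` would break this property, which is why the sketch omits one.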

