Constellation: Learning relational abstractions over objects for compositional imagination
This addresses a bottleneck in bridging perception with reasoning for AI systems, though it is a first step and likely incremental in the context of existing slot-based models.
The paper tackles the problem of learning structured representations of visual scenes by introducing Constellation, a network that learns relational abstractions over objects, enabling generalization and potential for abstract reasoning and imagination.
Learning structured representations of visual scenes is currently a major bottleneck to bridging perception with reasoning. While there has been exciting progress with slot-based models, which learn to segment scenes into sets of objects, learning configurational properties of entire groups of objects is still under-explored. To address this problem, we introduce Constellation, a network that learns relational abstractions of static visual scenes, and generalises these abstractions over sensory particularities, thus offering a potential basis for abstract relational reasoning. We further show that this basis, along with language association, provides a means to imagine sensory content in new ways. This work is a first step in the explicit representation of visual relationships and using them for complex cognitive procedures.