CV · AI · LG · Jul 4, 2022

ViRel: Unsupervised Visual Relations Discovery with Graph-level Analogy

MIT
arXiv:2207.00590v1 · 2 citations · h-index: 148
Originality: Highly original
AI Analysis

This work addresses the challenge of automatically learning visual relations for scene understanding without predefined labels, enabling generalization to more complicated relational structures, though its improvement over existing unsupervised methods is incremental.

The paper tackles the problem of unsupervised visual relation discovery by introducing ViRel, which uses graph-level analogy to learn relational structures without supervision, achieving over 95% accuracy in relation classification and generalizing to unseen tasks with more complex structures.

Visual relations form the basis of understanding our compositional world, as relationships between visual objects capture key information in a scene. It is then advantageous to learn relations automatically from the data, as learning with predefined labels cannot capture all possible relations. However, current relation learning methods typically require supervision, and are not designed to generalize to scenes with more complicated relational structures than those seen during training. Here, we introduce ViRel, a method for unsupervised discovery and learning of Visual Relations with graph-level analogy. In a setting where scenes within a task share the same underlying relational subgraph structure, our learning method of contrasting isomorphic and non-isomorphic graphs discovers the relations across tasks in an unsupervised manner. Once the relations are learned, ViRel can then retrieve the shared relational graph structure for each task by parsing the predicted relational structure. Using a dataset based on grid-world and the Abstract Reasoning Corpus, we show that our method achieves above 95% accuracy in relation classification, discovers the relation graph structure for most tasks, and further generalizes to unseen tasks with more complicated relational structures.
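
To make the notion of isomorphic versus non-isomorphic relational graphs concrete, here is a minimal sketch. It is not the ViRel training procedure: it only checks, by brute force, whether two small scene graphs share the same labeled structure. The representation (sets of `(source, relation, target)` triples), the function names, the object ids, and the relation names are all assumptions made for illustration, and the check compares whole graphs rather than the shared subgraphs described in the abstract.

```python
from itertools import permutations


def nodes_of(graph):
    """Collect the node ids that appear in a set of (src, relation, dst) triples."""
    return sorted({n for src, _, dst in graph for n in (src, dst)})


def is_isomorphic(g1, g2):
    """Brute-force test: does some renaming of g1's nodes map its labeled edges onto g2's?"""
    n1, n2 = nodes_of(g1), nodes_of(g2)
    if len(n1) != len(n2) or len(g1) != len(g2):
        return False
    # Try every bijection between the two node sets (fine for tiny graphs only).
    for perm in permutations(n2):
        mapping = dict(zip(n1, perm))
        if {(mapping[s], r, mapping[d]) for s, r, d in g1} == set(g2):
            return True
    return False


# Hypothetical scenes: nodes are object ids, labels are made-up relation names.
scene_a = {("a", "same-shape", "b"), ("b", "inside", "c")}
scene_b = {("x", "inside", "y"), ("z", "same-shape", "x")}
scene_c = {("p", "same-shape", "q"), ("q", "same-shape", "r")}

print(is_isomorphic(scene_a, scene_b))  # same structure under a -> z, b -> x, c -> y
print(is_isomorphic(scene_a, scene_c))  # different relation labels, so no match
```

In this toy setting, `scene_a` and `scene_b` would count as a positive (isomorphic) pair and `scene_a` and `scene_c` as a negative (non-isomorphic) pair. In the paper the relation labels are not given and must be discovered, so a learned model would have to stand in for the hand-written labels used here.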

