CV | AI | May 21, 2024

Scene Graph Generation Strategy with Co-occurrence Knowledge and Learnable Term Frequency

arXiv:2405.12648v1 | 8 citations | h-index: 11 | ICML
Originality: Incremental advance
AI Analysis

This work addresses scene graph generation for image understanding, offering incremental improvements by incorporating co-occurrence knowledge and handling long-tail issues.

The paper addresses two gaps in prior scene graph generation work: object co-occurrence is ignored, and the long-tail distribution of the training data is handled only through sampling and learning methods. It proposes CooK and a learnable TF-l-IDF, and reports a performance improvement of up to 3.8% over state-of-the-art models on the SGGen subtask.

Scene graph generation (SGG) is an important task in image understanding because it represents the relationships between objects in an image as a graph structure, making it possible to understand the semantic relationships between objects intuitively. Previous SGG studies used message-passing neural networks (MPNNs) to update features, which can effectively reflect information about surrounding objects. However, these studies have failed to reflect the co-occurrence of objects during scene graph generation. In addition, they only addressed the long-tail problem of the training dataset from the perspectives of sampling and learning methods. To address these two problems, we propose CooK, which reflects the Co-occurrence Knowledge between objects, and the learnable term frequency-inverse document frequency (TF-l-IDF) to solve the long-tail problem. We applied the proposed model to the SGG benchmark dataset, and the results showed a performance improvement of up to 3.8% compared with existing state-of-the-art models in the SGGen subtask. The results also indicate that the proposed method generalizes, showing a uniform performance improvement across all MPNN models.
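
The abstract names two ingredients without spelling them out: object co-occurrence statistics and a TF-IDF-style weighting for the long tail. The sketch below is a minimal illustration of the classical versions of both ideas, not the paper's CooK module or its learnable TF-l-IDF. The function names, the toy labels, and the choice to treat each image as a "document" are all assumptions made for illustration.

```python
import math
from collections import Counter
from itertools import combinations


def cooccurrence_probs(images):
    """Estimate P(b in image | a in image) from per-image object labels.

    `images` is a list of label lists (hypothetical toy format).
    """
    single = Counter()  # number of images containing each label
    pair = Counter()    # number of images containing both labels
    for labels in images:
        uniq = sorted(set(labels))
        single.update(uniq)
        for a, b in combinations(uniq, 2):
            pair[(a, b)] += 1
            pair[(b, a)] += 1
    return {(a, b): n / single[a] for (a, b), n in pair.items()}


def idf_weights(predicate_sets):
    """Classical smoothed IDF per predicate, one 'document' per image.

    This is the fixed textbook formula; the paper's variant is learnable,
    which this sketch does not attempt to reproduce.
    """
    n_docs = len(predicate_sets)
    df = Counter()
    for preds in predicate_sets:
        df.update(set(preds))
    return {p: math.log((1 + n_docs) / (1 + c)) + 1.0 for p, c in df.items()}


# Toy data (assumed, not from the paper's benchmark).
images = [
    ["person", "horse", "hat"],
    ["person", "horse"],
    ["person", "dog"],
    ["dog", "frisbee"],
]
predicates = [["on", "wearing"], ["on", "riding"], ["on"], ["on", "catching"]]

probs = cooccurrence_probs(images)
idf = idf_weights(predicates)
print(probs[("person", "horse")], idf["on"], idf["riding"])
```

In this toy setting the frequent predicate "on" receives the smallest weight and rare predicates such as "riding" receive larger ones, which is the intuition behind using an IDF-style term against a long-tailed predicate distribution.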

