CV · AI · Jan 23, 2024

SGTR+: End-to-end Scene Graph Generation with Transformer

arXiv:2401.12835v1 · 23 citations · h-index: 20 · Has Code · IEEE Trans Pattern Anal Mach Intell
Originality: Incremental advance
AI Analysis

This work addresses scene graph generation for visual understanding. The advance is incremental: it builds on existing SGG methods and adds a novel bipartite graph formulation.

The authors tackle Scene Graph Generation (SGG) with a transformer-based end-to-end framework that formulates the task as bipartite graph construction. It achieves state-of-the-art or comparable performance on three benchmarks, with higher inference efficiency.

Scene Graph Generation (SGG) remains a challenging visual understanding task due to its compositional property. Most previous works adopt either a bottom-up, two-stage approach or a point-based, one-stage approach, which often suffer from high time complexity or suboptimal designs. In this work, we propose a novel SGG method that addresses these issues by formulating the task as a bipartite graph construction problem. We create a transformer-based end-to-end framework to generate the entity and entity-aware predicate proposal sets, and infer directed edges to form relation triplets. Moreover, we design a graph assembling module to infer the connectivity of the bipartite scene graph based on our entity-aware structure, enabling us to generate the scene graph in an end-to-end manner. Building on the bipartite graph assembling paradigm, we further propose a new technical design to improve the efficacy of entity-aware modeling and the optimization stability of graph assembling. Equipped with the enhanced entity-aware design, our method achieves a favorable trade-off between performance and time complexity. Extensive experimental results show that our design achieves state-of-the-art or comparable performance on three challenging benchmarks, surpassing most existing approaches while offering higher inference efficiency. Code is available at https://github.com/Scarecrow0/SGTR
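
The graph assembling idea in the abstract can be illustrated with a toy sketch. It is not the authors' implementation: the abstract does not specify the matching function, so cosine similarity, the `top_k` parameter, and every name below (`assemble_triplets`, the subject/object indicator vectors) are assumptions made purely for illustration. Each predicate proposal carries two entity indicators, and each indicator is linked to its most similar entity proposal, which yields the directed edges of the bipartite graph.

```python
import math
from typing import List, Tuple

Vector = List[float]
Entity = Tuple[str, Vector]              # (label, feature), hypothetical layout
Predicate = Tuple[str, Vector, Vector]   # (label, subject indicator, object indicator)


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def assemble_triplets(
    entities: List[Entity], predicates: List[Predicate], top_k: int = 1
) -> List[Tuple[str, str, str, float]]:
    """Link each predicate node to its top-k subject and object entity nodes.

    Assumption: similarity between an indicator and an entity feature stands
    in for whatever correspondence measure the actual model uses.
    """
    triplets = []
    for p_label, subj_vec, obj_vec in predicates:
        subj = sorted(
            ((cosine(subj_vec, f), i) for i, (_, f) in enumerate(entities)),
            reverse=True,
        )[:top_k]
        obj = sorted(
            ((cosine(obj_vec, f), i) for i, (_, f) in enumerate(entities)),
            reverse=True,
        )[:top_k]
        for s_score, s_idx in subj:
            for o_score, o_idx in obj:
                if s_idx == o_idx:
                    continue  # skip self-relations in this toy version
                triplets.append(
                    (entities[s_idx][0], p_label, entities[o_idx][0], s_score * o_score)
                )
    # Rank triplets by the product of the two edge scores.
    triplets.sort(key=lambda t: t[3], reverse=True)
    return triplets


# Toy proposals with hand-picked 3-d features (illustrative values only).
entities = [("person", [1.0, 0.0, 0.0]), ("horse", [0.0, 1.0, 0.0]), ("hat", [0.0, 0.0, 1.0])]
predicates = [
    ("riding", [0.9, 0.1, 0.0], [0.1, 0.9, 0.0]),
    ("wearing", [1.0, 0.0, 0.1], [0.0, 0.1, 1.0]),
]
result = assemble_triplets(entities, predicates)
```

In this sketch the predicate nodes sit on one side of the bipartite graph and the entity nodes on the other; a triplet is simply a predicate together with the two entities it was linked to.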

Code Implementations: 1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
