CV · Jun 27, 2024

Enhanced Data Transfer Cooperating with Artificial Triplets for Scene Graph Generation

arXiv:2406.19316v2
Originality: Incremental advance
AI Analysis

This work addresses insufficient supervision for informative relational triplets in SGG, a problem that matters for scene understanding in computer vision. The advance is incremental, since it builds on existing dataset enhancement techniques.

The paper tackles poor Scene Graph Generation (SGG) performance on informative relational triplets that have too few training samples. It proposes two dataset enhancement modules, Feature Space Triplet Augmentation (FSTA) and Soft Transfer, which together achieve the highest mean of Recall and mean Recall among model-agnostic methods on the Visual Genome dataset.

This work focuses on training dataset enhancement of informative relational triplets for Scene Graph Generation (SGG). Due to the lack of effective supervision, current SGG models perform poorly on informative relational triplets with inadequate training samples. Therefore, we propose two novel training dataset enhancement modules: Feature Space Triplet Augmentation (FSTA) and Soft Transfer. FSTA leverages a feature generator trained to generate representations of an object in relational triplets. Its biased-prediction-based sampling efficiently generates artificial triplets, focusing on the challenging ones. In addition, we introduce Soft Transfer, which assigns soft predicate labels to general relational triplets to provide more effective supervision for informative predicate classes. Experimental results show that integrating FSTA and Soft Transfer achieves high levels of both Recall and mean Recall on the Visual Genome dataset. The mean of Recall and mean Recall is the highest among all existing model-agnostic methods.
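The headline metric in the abstract, the mean of Recall and mean Recall, can be illustrated with a minimal sketch. It assumes a simplified setting that the abstract does not specify: ground-truth triplets are pooled into one list, and each one is marked as recovered or not within the top-K predictions. The function names and the toy data are hypothetical, and actual SGG benchmarks typically aggregate per image and per task setting.

```python
from collections import defaultdict


def recall_and_mean_recall(gt_predicates, hit_flags):
    """Return (Recall, mean Recall) for a pooled list of ground-truth triplets.

    gt_predicates[i] is the predicate class of ground-truth triplet i.
    hit_flags[i] is True if triplet i was recovered in the top-K predictions.
    This pooling is a simplifying assumption, not the paper's protocol.
    """
    if len(gt_predicates) != len(hit_flags):
        raise ValueError("gt_predicates and hit_flags must have equal length")
    if not gt_predicates:
        raise ValueError("at least one ground-truth triplet is required")

    # Recall: fraction of all ground-truth triplets that were recovered.
    recall = sum(1 for hit in hit_flags if hit) / len(gt_predicates)

    # mean Recall: compute recall per predicate class, then average the
    # classes with equal weight, so rare predicates count as much as common ones.
    per_class = defaultdict(lambda: [0, 0])  # predicate -> [hits, total]
    for predicate, hit in zip(gt_predicates, hit_flags):
        per_class[predicate][1] += 1
        if hit:
            per_class[predicate][0] += 1
    mean_recall = sum(h / n for h, n in per_class.values()) / len(per_class)

    return recall, mean_recall


def mean_of_recall_and_mean_recall(recall, mean_recall):
    """Arithmetic mean of Recall and mean Recall."""
    return (recall + mean_recall) / 2


# Toy data: a frequent general predicate and a rare informative one.
gt = ["on"] * 8 + ["parked on"] * 2
hits = [True] * 7 + [False] + [False, False]
r, mr = recall_and_mean_recall(gt, hits)
print(r, mr, mean_of_recall_and_mean_recall(r, mr))
```

In this toy case Recall is 0.7 while mean Recall is only 0.4375, because the rare predicate "parked on" is never recovered. Averaging the two rewards methods that improve informative predicates without sacrificing overall Recall.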

