Predicate correlation learning for scene graph generation
This work addresses the long-tailed distribution problem in Scene Graph Generation (SGG) for computer vision applications, representing an incremental improvement over existing methods.
The paper tackles the performance gap between head and tail predicate classes in Scene Graph Generation (SGG). It proposes a Predicate Correlation Learning (PCL) method that accounts for both the semantic overlap between predicates and the long-tailed data distribution, and it reports significantly improved tail-class performance on the Visual Genome benchmark.
For a typical Scene Graph Generation (SGG) method, there is often a large performance gap between the head and tail predicate classes. This gap is mainly caused by the semantic overlap between different predicates and by the long-tailed data distribution. In this paper, a Predicate Correlation Learning (PCL) method for SGG is proposed to address both problems by taking the correlation between predicates into consideration. To describe the semantic overlap between strongly correlated predicate classes, a Predicate Correlation Matrix (PCM) is defined to quantify the relationship between predicate pairs; the PCM is dynamically updated to remove its long-tailed bias. In addition, the PCM is integrated into a Predicate Correlation Loss function ($L_{PC}$) to reduce the discouraging gradients applied to unannotated classes. The proposed method is evaluated on the Visual Genome benchmark, where the performance of the tail classes is significantly improved when PCL is built on top of existing methods.
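The abstract does not give the formulas for the PCM or for $L_{PC}$, so the following is only a minimal illustrative sketch of the general idea, not the authors' formulation. It assumes (a) that the PCM row of a ground-truth predicate is updated as an exponential moving average of the model's predicted probabilities, and (b) that the loss is a softmax cross-entropy in which each negative class is down-weighted by its correlation with the ground-truth class. The function names `update_pcm` and `pc_loss_and_grad`, the `momentum` parameter, and both update rules are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def update_pcm(pcm, probs, label, momentum=0.9):
    """Assumed update rule (not from the paper): move row `label` of the
    correlation matrix toward the predicted distribution via an EMA."""
    for j, p in enumerate(probs):
        if j == label:
            continue  # diagonal stays 0: the true class keeps full weight
        pcm[label][j] = momentum * pcm[label][j] + (1.0 - momentum) * p
    return pcm


def pc_loss_and_grad(logits, label, pcm):
    """Assumed loss (not from the paper): cross-entropy where negative class j
    contributes to the softmax denominator with weight 1 - pcm[label][j].
    Returns the loss and its gradient with respect to the logits."""
    m = max(logits)
    weights = [1.0 if j == label else 1.0 - pcm[label][j]
               for j in range(len(logits))]
    terms = [w * math.exp(z - m) for w, z in zip(weights, logits)]
    total = sum(terms)
    loss = -math.log(terms[label] / total)
    # d(loss)/d(z_j) = w_j * exp(z_j) / Z - [j == label]
    grad = [t / total for t in terms]
    grad[label] -= 1.0
    return loss, grad


# Toy example with three predicate classes; class 1 overlaps with class 0.
logits = [2.0, 1.5, -1.0]
no_corr = [[0.0] * 3 for _ in range(3)]
with_corr = [[0.0, 0.8, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]

plain_loss, plain_grad = pc_loss_and_grad(logits, 0, no_corr)
pc_loss, pc_grad = pc_loss_and_grad(logits, 0, with_corr)
```

With an all-zero matrix the sketch reduces to ordinary cross-entropy. With a non-zero correlation between classes 0 and 1, the gradient that pushes down the logit of the correlated but unannotated class 1 becomes smaller, which is the effect the abstract attributes to $L_{PC}$.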