QPredSGG: Hybrid Quantum Predicate Learning for Long-Tailed Scene Graph Generation

Prerana Ramkumar, Nouhaila Innan, Muhammad Shafique

arXiv:2606.04689

AI Analysis

For researchers in scene graph generation and quantum machine learning, this work provides a proof-of-concept that hybrid quantum models can improve long-tail predicate classification with significantly fewer parameters.

The paper introduces a hybrid quantum predicate classifier for Scene Graph Generation that replaces the classical predicate head with a quantum one. On Visual Genome 150, it reaches an mR@100 of 57.25% (vs. 41.1% for the classical CFEN reference) with only 96 trainable quantum parameters, which the authors present as evidence of parameter-efficient long-tail relational classification.
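
Mean recall at K (mR@K), the metric quoted above, averages recall over predicate classes rather than over all ground-truth triplets, so rare predicates count as much as frequent ones. The sketch below illustrates that difference in simplified form. It assumes the top-K matching has already been done and only per-predicate hit counts remain. The predicate names and counts are made-up toy values, not results from the paper, and the function names are hypothetical.

```python
def overall_recall(hits, totals):
    """Recall pooled over all ground-truth triplets (dominated by frequent predicates)."""
    return sum(hits.get(p, 0) for p in totals) / sum(totals.values())


def mean_recall(hits, totals):
    """Recall computed per predicate class, then averaged with equal class weight."""
    per_class = {p: hits.get(p, 0) / n for p, n in totals.items() if n > 0}
    return sum(per_class.values()) / len(per_class), per_class


# Toy counts (illustrative assumption): one head predicate, two tail predicates.
totals = {"on": 1000, "holding": 50, "parked on": 10}   # ground-truth triplets
hits = {"on": 900, "holding": 20, "parked on": 2}       # matched within top K

pooled = overall_recall(hits, totals)
mr, per_class = mean_recall(hits, totals)

print(f"pooled recall: {pooled:.2f}")   # high, because "on" dominates the pool
print(f"mean recall:   {mr:.2f}")       # lower, because tail classes weigh equally
for predicate, value in per_class.items():
    print(f"  {predicate}: {value:.2f}")
```

In this toy case the pooled recall is about 0.87 while the mean recall is about 0.50, which is why mR@K is the usual headline number for long-tail predicate evaluation.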

Scene Graph Generation (SGG) requires relational reasoning over objects and their interactions, but performance is often limited by severe long-tail predicate imbalance. Classical SGG models frequently rely on dataset statistics, leading to biased predictions toward frequent relations rather than fine-grained semantic predicates. Although existing debiasing strategies improve mean recall, predicate classification in current frameworks still often depends on large classical decision modules with high parameter cost. This work introduces a hybrid quantum predicate classifier for SGG by replacing the classical predicate head in the Causal Feature Enhancement Network (CFEN) with a Quantum Predicate Head (QP-Head) trained using weighted cross-entropy. To the best of our knowledge, this is among the first studies to evaluate a hybrid quantum architecture for scene graph predicate classification on Visual Genome 150. We study the effect of qubit count, encoding strategy, entangling structure, and circuit depth on relational prediction. The best 4-qubit QP-Head uses Amplitude Embedding and Strongly Entangling Layers to compress 4096-dimensional pair features into a 16-dimensional quantum-compatible representation, corresponding to a 256× reduction. It achieves an mR@100 of 57.25%, compared with 41.1% for the classical CFEN reference, while using only 96 trainable quantum parameters. Scaling to 8 qubits maintains strong long-tail performance, reaching an mR@100 of 55.38% with 384 quantum parameters, while the depth analysis shows a trade-off between expressibility and runtime overhead. These results suggest that compact hybrid quantum predicate heads can support parameter-efficient long-tail relational classification in complex visual reasoning tasks.
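
The dimensions and parameter counts in the abstract can be checked with a little arithmetic. The sketch below is not the authors' code. It assumes that amplitude embedding on n qubits takes a vector of length 2^n, and that a strongly entangling ansatz carries three rotation angles per qubit per layer (as in PennyLane's StronglyEntanglingLayers template, whose weights have shape (layers, wires, 3)). Under that assumption, 96 parameters on 4 qubits would correspond to 8 layers, and 384 parameters on 8 qubits to 16 layers. The abstract does not state the layer counts, so treat them as an inference. All function names are hypothetical.

```python
import math

PAIR_FEATURE_DIM = 4096  # pair-feature size stated in the abstract


def amplitude_dim(n_qubits):
    """Number of amplitudes in an n-qubit state vector: 2**n."""
    return 2 ** n_qubits


def implied_layers(n_params, n_qubits):
    """Layer count implied by n_params, assuming 3 rotation angles per qubit per layer."""
    per_layer = 3 * n_qubits
    if n_params % per_layer != 0:
        raise ValueError("parameter count is not a whole number of layers")
    return n_params // per_layer


def normalize(features):
    """L2-normalize a vector so its squared entries sum to 1, as amplitude encoding needs."""
    norm = math.sqrt(sum(x * x for x in features))
    if norm == 0.0:
        raise ValueError("cannot amplitude-encode an all-zero vector")
    return [x / norm for x in features]


# 4-qubit head: 2**4 = 16 amplitudes, so 4096 -> 16 is a 256x reduction.
dim_4q = amplitude_dim(4)
reduction = PAIR_FEATURE_DIM // dim_4q
print(f"4 qubits: {dim_4q} amplitudes, {reduction}x reduction")

# Layer counts implied by the reported parameter budgets (an inference, see above).
layers_4q = implied_layers(96, 4)
layers_8q = implied_layers(384, 8)
print(f"implied layers: {layers_4q} (4 qubits), {layers_8q} (8 qubits)")

# A compressed 16-dimensional feature must be unit-norm before it is loaded as amplitudes.
amplitudes = normalize([float(i + 1) for i in range(dim_4q)])
print(f"sum of squared amplitudes: {sum(a * a for a in amplitudes):.6f}")
```

The normalization step matters because amplitude encoding discards the overall scale of the feature vector, so any information carried by its magnitude has to be preserved elsewhere, for example by the classical compression layer that precedes the circuit.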
