VGStore: A Multimodal Extension to SPARQL for Querying RDF Scene Graph
This work targets researchers and practitioners in the Semantic Web and multimodal AI communities by enabling more expressive queries over RDF-stored scene graphs. The contribution is incremental, since it builds on existing SPARQL technology.
The paper tackles the problem of querying multimodal scene graphs by extending SPARQL to handle implicit relationships such as semantic similarity and spatial relations. The resulting system, VGStore, is demonstrated on customized queries and on the display of multimodal data.
Semantic Web technology has successfully supported many RDF models with rich data representation methods. It also has the potential to represent and store multimodal knowledge bases such as multimodal scene graphs. However, most existing query languages, SPARQL in particular, barely explore implicit multimodal relationships such as semantic similarity and spatial relations. We first explored this issue by organizing a large-scale scene graph dataset, Visual Genome, in an RDF graph database. Building on this RDF-stored multimodal scene graph, we extended SPARQL queries to answer questions that require relational reasoning about attributes such as color and spatial position. A further demo, VGStore, shows the effectiveness of customized queries and of displaying multimodal data.
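To make "implicit multimodal relationships" concrete, the sketch below shows in plain Python the kind of predicate such a query extension would need to evaluate: a color-similarity test and a spatial "left of" test over bounding boxes. It is an illustrative sketch only and not VGStore's implementation. The names SceneObject, is_similar_color, is_left_of, and query_similar_color_left_of, the RGB distance threshold, the (x, y, width, height) box layout, and the sample objects are all assumptions introduced here.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass(frozen=True)
class SceneObject:
    """Hypothetical scene-graph node: a label, an RGB color, and a bounding box."""
    name: str
    color: tuple  # (r, g, b), each 0..255
    box: tuple    # (x, y, width, height), origin at the top-left corner


def color_distance(a, b):
    """Euclidean distance between two RGB triples."""
    return sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def is_similar_color(a, b, threshold=100.0):
    """Assumed similarity rule: RGB distance at or below a fixed threshold."""
    return color_distance(a, b) <= threshold


def is_left_of(a, b):
    """True when box a's right edge does not pass box b's left edge."""
    return a[0] + a[2] <= b[0]


# Made-up sample data standing in for objects stored in the graph.
OBJECTS = [
    SceneObject("car", (200, 30, 30), (10, 50, 100, 60)),
    SceneObject("tree", (30, 140, 40), (150, 10, 80, 120)),
    SceneObject("sign", (220, 40, 40), (300, 20, 30, 30)),
]


def query_similar_color_left_of(objects, target_rgb, anchor_name):
    """Names of objects that are close to target_rgb and left of the anchor."""
    anchor = next(o for o in objects if o.name == anchor_name)
    return [
        o.name
        for o in objects
        if o is not anchor
        and is_similar_color(o.color, target_rgb)
        and is_left_of(o.box, anchor.box)
    ]


# "Which reddish objects are to the left of the sign?"
RESULT = query_similar_color_left_of(OBJECTS, (255, 0, 0), "sign")
print(RESULT)
```

In a SPARQL setting, predicates like these would typically be exposed as extension functions callable from a FILTER clause. How VGStore actually exposes them is not stated in this abstract.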