CL | Nov 12, 2025

Spider4SSC & S2CLite: A text-to-multi-query-language dataset using lightweight ontology-agnostic SPARQL to Cypher parser

arXiv:2511.09354v1 | h-index: 2 | Has Code
Originality: Incremental advance
AI Analysis

This work addresses the need for efficient cross-database query translation in data integration and knowledge graph applications, though it is incremental as it builds on existing datasets and methods.

The paper tackles SPARQL-to-Cypher query translation by introducing S2CLite, a lightweight, ontology-agnostic parser. S2CLite reaches a parsing accuracy of 77.8% on Spider4SPARQL, compared with 44.2% for the state-of-the-art S2CTrans, and is used to generate the Spider4SSC dataset: 4525 unique questions and three equivalent sets of 2581 matching queries in SQL, SPARQL, and Cypher.

We present the Spider4SSC dataset and the S2CLite parsing tool. S2CLite is a lightweight, ontology-agnostic parser that translates SPARQL queries into Cypher queries, enabling both in-situ and large-scale SPARQL-to-Cypher translation. Unlike existing solutions, S2CLite is purely rule-based (inspired by traditional programming language compilers) and operates without requiring an RDF graph or external tools. Experiments conducted on the BSBM42 and Spider4SPARQL datasets show that S2CLite significantly reduces query parsing errors, achieving a total parsing accuracy of 77.8% on Spider4SPARQL compared to 44.2% for the state-of-the-art S2CTrans. Furthermore, S2CLite achieved a 96.6% execution accuracy on the intersecting subset of queries parsed by both parsers, outperforming S2CTrans by 7.3%. We further use S2CLite to parse Spider4SPARQL queries into Cypher and generate Spider4SSC, a unified Text-to-Query language (SQL, SPARQL, Cypher) dataset with 4525 unique questions and 3 equivalent sets of 2581 matching queries (SQL, SPARQL, and Cypher). We open-source S2CLite for further development on GitHub (github.com/vejvarm/S2CLite) and provide the clean Spider4SSC dataset for download.
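To give a rough sense of what a purely rule-based SPARQL-to-Cypher translation could look like, here is a minimal Python sketch. It is not S2CLite's implementation and does not reflect its actual rule set; the `translate` function, the `TRIPLE` regular expression, and the mapping of every predicate to a Cypher relationship are assumptions made for illustration only. It handles only SELECT queries whose triple patterns consist of two variables joined by a prefixed predicate.

```python
import re

# Hypothetical toy rule (not S2CLite's): a SPARQL triple pattern
# "?s :pred ?o" becomes the Cypher pattern "(s)-[:pred]->(o)".
TRIPLE = re.compile(r"\?(\w+)\s+:(\w+)\s+\?(\w+)\s*\.?")


def translate(sparql: str) -> str:
    """Translate a tiny SELECT query made of variable-only triple patterns."""
    # Split the query into its projection and its WHERE body.
    m = re.search(r"SELECT\s+(.*?)\s+WHERE\s*\{(.*)\}", sparql, re.S | re.I)
    if m is None:
        raise ValueError("unsupported query shape")
    variables = re.findall(r"\?(\w+)", m.group(1))
    # Apply the single rewrite rule to each triple pattern in the body.
    patterns = [f"({s})-[:{p}]->({o})" for s, p, o in TRIPLE.findall(m.group(2))]
    if not patterns:
        raise ValueError("no supported triple patterns found")
    return f"MATCH {', '.join(patterns)} RETURN {', '.join(variables)}"


if __name__ == "__main__":
    print(translate("SELECT ?a ?b WHERE { ?a :knows ?b . }"))
```

The sketch works on the query text alone, with no RDF graph or ontology lookup, which is the property the abstract highlights. A real translator would also need rules for literals, filters, aggregates, and predicates that correspond to node properties rather than relationships.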
