RDFGraphGen: An RDF Graph Generator based on SHACL Shapes
The paper addresses a need among researchers and developers working with RDF-based applications by providing a tool that generates synthetic data, though the contribution is incremental because it builds on the existing SHACL standard.
The paper tackles the lack of domain-specific RDF datasets with particular characteristics for developing and testing applications. It proposes RDFGraphGen, an open-source RDF graph generator that uses SHACL shapes to create synthetic graphs. The reported results indicate that it is scalable and can generate graphs of various sizes in any domain.
Developing and testing modern RDF-based applications often requires access to RDF datasets with certain characteristics. Unfortunately, it is very difficult to find publicly available domain-specific knowledge graphs that conform to a particular set of characteristics. Hence, in this paper we propose RDFGraphGen, an open-source RDF graph generator that uses characteristics provided in the form of SHACL (Shapes Constraint Language) shapes to generate synthetic RDF graphs. RDFGraphGen is domain-agnostic, with configurable graph structure, value constraints, and distributions. It also comes with a number of predefined values for popular schema.org classes and properties, for more realistic graphs. Our results show that RDFGraphGen is scalable and can generate small, medium, and large RDF graphs in any domain.
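To make the idea of shape-driven generation concrete, the following is a minimal sketch in plain Python. It is not RDFGraphGen's implementation or API. The dictionary `PERSON_SHAPE` is a hypothetical, simplified stand-in for a SHACL node shape (its keys loosely mirror `sh:targetClass`, `sh:path`, `sh:minCount`, `sh:maxCount`, `sh:datatype`, `sh:in`, `sh:minInclusive`, and `sh:maxInclusive`), and the functions `generate_value` and `generate_graph`, the `http://example.org/` IRIs, and the fallback defaults are all assumptions made for illustration.

```python
import random

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer"

# Hypothetical, simplified stand-in for a SHACL node shape. Each property
# entry mirrors one sh:property constraint block.
PERSON_SHAPE = {
    "target_class": "http://schema.org/Person",
    "properties": [
        {"path": "http://schema.org/givenName",
         "min_count": 1, "max_count": 1,
         "in": ["Alice", "Bob", "Carol"]},
        {"path": "http://example.org/age",  # illustrative property, not from schema.org
         "min_count": 0, "max_count": 1,
         "datatype": XSD_INTEGER,
         "min_inclusive": 18, "max_inclusive": 90},
    ],
}


def generate_value(prop, rng):
    """Return an N-Triples-style literal that satisfies one property constraint."""
    if "in" in prop:
        # Enumerated values: pick one of the allowed options.
        return '"%s"' % rng.choice(prop["in"])
    if prop.get("datatype") == XSD_INTEGER:
        # Bounded integers: sample uniformly inside the inclusive range.
        low = prop.get("min_inclusive", 0)
        high = prop.get("max_inclusive", low + 100)  # assumed default upper bound
        return '"%d"^^<%s>' % (rng.randint(low, high), XSD_INTEGER)
    # Fallback for unconstrained properties: an arbitrary plain string.
    return '"value-%d"' % rng.randrange(10 ** 6)


def generate_graph(shape, entity_count, seed=0):
    """Generate (subject, predicate, object) triples for entity_count entities."""
    rng = random.Random(seed)  # seeded so that runs are reproducible
    triples = []
    for i in range(entity_count):
        subject = "<http://example.org/entity/%d>" % i
        triples.append((subject, "<%s>" % RDF_TYPE, "<%s>" % shape["target_class"]))
        for prop in shape["properties"]:
            min_count = prop.get("min_count", 0)
            max_count = prop.get("max_count", max(min_count, 1))
            # Respect the cardinality bounds when deciding how many values to emit.
            for _ in range(rng.randint(min_count, max_count)):
                triples.append((subject, "<%s>" % prop["path"],
                                generate_value(prop, rng)))
    return triples


if __name__ == "__main__":
    for s, p, o in generate_graph(PERSON_SHAPE, 3):
        print(s, p, o, ".")
```

Running the script prints one typed entity per iteration together with its generated property values, one triple per line. Increasing `entity_count` scales the output linearly, which is the simplest way to picture how a shape-driven generator can produce small, medium, or large graphs from the same set of constraints.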