pyRDF2Vec: A Python Implementation and Extension of RDF2Vec
This work provides a convenient tool for data scientists working with RDF data, but its contribution is incremental, since it reimplements and extends an existing algorithm.
The paper introduces pyRDF2Vec, a Python package that reimplements the RDF2Vec algorithm and its extensions, simplifying its use for data scientists and including optimizations for more efficient walk extraction.
This paper introduces pyRDF2Vec, a Python software package that reimplements the well-known RDF2Vec algorithm along with several of its extensions. By making the algorithm available in the most popular data science language, and by bundling all extensions in a single place, the package simplifies the use of RDF2Vec for data scientists. The package is released under an MIT license and is structured to foster further research into sampling, walking, and embedding strategies, which are vital components of the RDF2Vec algorithm. Several optimisations have been implemented in pyRDF2Vec that allow for more efficient walk extraction than the original algorithm. Furthermore, best practices in code styling, testing, and documentation were applied to make the package future-proof and to facilitate external contributions.
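To make the walking component concrete, the following is a minimal sketch of random walk extraction over a toy graph. It does not use the pyRDF2Vec API; the triples, the function names `build_adjacency` and `random_walks`, and the parameters `max_depth`, `n_walks`, and `seed` are all hypothetical and chosen for illustration only. It uses uniform random sampling of outgoing edges, which is just one possible sampling strategy.

```python
import random
from collections import defaultdict

# Toy knowledge graph as (subject, predicate, object) triples.
# All identifiers here are hypothetical examples.
TRIPLES = [
    ("ex:Alice", "ex:knows", "ex:Bob"),
    ("ex:Bob", "ex:worksAt", "ex:Acme"),
    ("ex:Alice", "ex:worksAt", "ex:Acme"),
    ("ex:Acme", "ex:locatedIn", "ex:Ghent"),
]


def build_adjacency(triples):
    """Map each subject to its outgoing (predicate, object) edges."""
    adjacency = defaultdict(list)
    for subject, predicate, obj in triples:
        adjacency[subject].append((predicate, obj))
    return adjacency


def random_walks(adjacency, root, max_depth, n_walks, seed=0):
    """Extract n_walks random walks of at most max_depth hops from root.

    Each walk alternates entities and predicates:
    (root, p1, e1, p2, e2, ...). A walk stops early when the
    current entity has no outgoing edges.
    """
    rng = random.Random(seed)  # local generator, so results are repeatable
    walks = []
    for _ in range(n_walks):
        walk = [root]
        node = root
        for _ in range(max_depth):
            edges = adjacency.get(node)
            if not edges:
                break
            # Uniform sampling of the next hop (an assumed strategy).
            predicate, node = rng.choice(edges)
            walk.extend([predicate, node])
        walks.append(tuple(walk))
    return walks


if __name__ == "__main__":
    graph = build_adjacency(TRIPLES)
    for walk in random_walks(graph, "ex:Alice", max_depth=2, n_walks=3):
        print(" -> ".join(walk))
```

In RDF2Vec, walks of this kind are treated as sentences and passed to a word embedding model such as Word2Vec, so that each entity receives a vector. Swapping the uniform `rng.choice` for a weighted choice would be one way to experiment with a different sampling strategy.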