AIMar 27, 2023
Describing and Organizing Semantic Web and Machine Learning Systems in the SWeMLS-KGFajar J. Ekaputra, Majlinda Llugiqi, Marta Sabou et al.
In line with the general trend in artificial intelligence research to create intelligent systems that combine learning and symbolic components, a new sub-area has emerged that focuses on combining machine learning (ML) components with techniques developed by the Semantic Web (SW) community - Semantic Web Machine Learning (SWeML for short). Due to its rapid growth and impact on several communities in the last two decades, there is a need to better understand the space of these SWeML Systems, their characteristics, and trends. Yet, surveys that adopt principled and unbiased approaches are missing. To fill this gap, we performed a systematic study and analyzed nearly 500 papers published in the last decade in this area, where we focused on evaluating architectural, and application-specific features. Our analysis identified a rapidly growing interest in SWeML Systems, with a high impact on several application domains and tasks. Catalysts for this rapid growth are the increased application of deep learning and knowledge graph technologies. By leveraging the in-depth understanding of this area acquired through this study, a further key contribution of this paper is a classification system for SWeML Systems which we publish as ontology.
DLMar 28, 2022
The Digitalization of Bioassays in the Open Research Knowledge GraphJennifer D'Souza, Anita Monteverdi, Muhammad Haris et al.
Background: Recent years are seeing a growing impetus in the semantification of scholarly knowledge at the fine-grained level of scientific entities in knowledge graphs. The Open Research Knowledge Graph (ORKG) https://www.orkg.org/ represents an important step in this direction, with thousands of scholarly contributions as structured, fine-grained, machine-readable data. There is a need, however, to engender change in traditional community practices of recording contributions as unstructured, non-machine-readable text. For this in turn, there is a strong need for AI tools designed for scientists that permit easy and accurate semantification of their scholarly contributions. We present one such tool, ORKG-assays. Implementation: ORKG-assays is a freely available AI micro-service in ORKG written in Python designed to assist scientists obtain semantified bioassays as a set of triples. It uses an AI-based clustering algorithm which on gold-standard evaluations over 900 bioassays with 5,514 unique property-value pairs for 103 predicates shows competitive performance. Results and Discussion: As a result, semantified assay collections can be surveyed on the ORKG platform via tabulation or chart-based visualizations of key property values of the chemicals and compounds offering smart knowledge access to biochemists and pharmaceutical researchers in the advancement of drug development.
DLApr 14, 2025
SciMantify -- A Hybrid Approach for the Evolving Semantification of Scientific KnowledgeLena John, Kheir Eddine Farfar, Sören Auer et al.
Scientific publications, primarily digitized as PDFs, remain static and unstructured, limiting the accessibility and reusability of the contained knowledge. At best, scientific knowledge from publications is provided in tabular formats, which lack semantic context. A more flexible, structured, and semantic representation is needed to make scientific knowledge understandable and processable by both humans and machines. We propose an evolution model of knowledge representation, inspired by the 5-star Linked Open Data (LOD) model, with five stages and defined criteria to guide the stepwise transition from a digital artifact, such as a PDF, to a semantic representation integrated in a knowledge graph (KG). Based on an exemplary workflow implementing the entire model, we developed a hybrid approach, called SciMantify, leveraging tabular formats of scientific knowledge, e.g., results from secondary studies, to support its evolving semantification. In the approach, humans and machines collaborate closely by performing semantic annotation tasks (SATs) and refining the results to progressively improve the semantic representation of scientific knowledge. We implemented the approach in the Open Research Knowledge Graph (ORKG), an established platform for improving the findability, accessibility, interoperability, and reusability of scientific knowledge. A preliminary user experiment showed that the approach simplifies the preprocessing of scientific knowledge, reduces the effort for the evolving semantification, and enhances the knowledge representation through better alignment with the KG structures.
DBApr 8, 2025
Rosetta Statements: Simplifying FAIR Knowledge Graph Construction with a User-Centered ApproachLars Vogt, Kheir Eddine Farfar, Pallavi Karanth et al.
Machines need data and metadata to be machine-actionable and FAIR (findable, accessible, interoperable, reusable) to manage increasing data volumes. Knowledge graphs and ontologies are key to this, but their use is hampered by high access barriers due to required prior knowledge in semantics and data modelling. The Rosetta Statement approach proposes modeling English natural language statements instead of a mind-independent reality. We propose a metamodel for creating semantic schema patterns for simple statement types. The approach supports versioning of statements and provides a detailed editing history. Each Rosetta Statement pattern has a dynamic label for displaying statements as natural language sentences. Implemented in the Open Research Knowledge Graph (ORKG) as a use case, this approach allows domain experts to define data schema patterns without needing semantic knowledge. Future plans include combining Rosetta Statements with semantic units to organize ORKG into meaningful subgraphs, improving usability. A search interface for querying statements without needing SPARQL or Cypher knowledge is also planned, along with tools for data entry and display using Large Language Models. The Rosetta Statement metamodel supports a three-step knowledge graph construction procedure. Domain experts can model semantic content without support from ontology engineers by using Wikidata, lowering entry barriers and increasing cognitive interoperability. The second level involves mapping Wikidata terms to established ontologies, and the third step developing semantic graph patterns for reasoning, requiring collaboration with ontology engineers.
DLJan 30, 2019
Open Research Knowledge Graph: Next Generation Infrastructure for Semantic Scholarly KnowledgeMohamad Yaser Jaradeh, Allard Oelen, Kheir Eddine Farfar et al.
Despite improved digital access to scholarly knowledge in recent decades, scholarly communication remains exclusively document-based. In this form, scholarly knowledge is hard to process automatically. In this paper, we present the first steps towards a knowledge graph based infrastructure that acquires scholarly knowledge in machine actionable form thus enabling new possibilities for scholarly knowledge curation, publication and processing. The primary contribution is to present, evaluate and discuss multi-modal scholarly knowledge acquisition, combining crowdsourced and automated techniques. We present the results of the first user evaluation of the infrastructure with the participants of a recent international conference. Results suggest that users were intrigued by the novelty of the proposed infrastructure and by the possibilities for innovative scholarly knowledge processing it could enable.