CLApr 19, 2022
Where Was COVID-19 First Discovered? Designing a Question-Answering System for Pandemic SituationsJohannes Graf, Gino Lancho, Patrick Zschech et al.
The COVID-19 pandemic is accompanied by a massive "infodemic" that makes it hard to identify concise and credible information for COVID-19-related questions, like incubation time, infection rates, or the effectiveness of vaccines. As a novel solution, our paper is concerned with designing a question-answering system based on modern technologies from natural language processing to overcome information overload and misinformation in pandemic situations. To carry out our research, we followed a design science research approach and applied Ingwersen's cognitive model of information retrieval interaction to inform our design process from a socio-technical lens. On this basis, we derived prescriptive design knowledge in terms of design requirements and design principles, which we translated into the construction of a prototypical instantiation. Our implementation is based on the comprehensive CORD-19 dataset, and we demonstrate our artifact's usefulness by evaluating its answer quality based on a sample of COVID-19 questions labeled by biomedical experts.
12.6DBApr 29
Graphify: Automated Synthesis of Type-Safe Graph Backends via $O(S)$ GraphQL-to-Gremlin TranspilationJohannes Graf
Graph databases offer unparalleled flexibility for managing interconnected data, yet the lack of strict schema enforcement often leads to runtime uncertainties and complex query development. This paper introduces Graphify, an end-to-end framework that enables developers to visually model graph data schemas and automatically synthesize a fully functional, type-safe backend. This paper proposes a formal mapping of GraphQL artifacts to the Gremlin traversal machine, supporting complete CRUD operations and arbitrarily nested queries. The system generates a transpiler capable of converting complex GraphQL requests into a single, optimized Gremlin query, including advanced features such as nested logical predicates, multi-key sorting, and pagination. At the core of the framework is a recursive state machine algorithm that processes GraphQL Abstract Syntax Trees (ASTs) with linear time complexity $O(S)$ relative to the number of selected fields. This paper demonstrates the practical efficiency and theoretical robustness of the approach through formal complexity analysis and empirical evaluation using the MovieLens 100k dataset. The result is a system that enables the generation of graph interfaces in minutes, bridging the gap between flexible graph storage and type-safe API consumption.