DC · CL · DL · IR · Jun 13, 2024

A Document-based Knowledge Discovery with Microservices Architecture

arXiv:2407.00053v1 · 1 citation
Originality: Synthesis-oriented
AI Analysis

The paper addresses the problem of 'data-rich but knowledge-poor' material in organizations undergoing digitalization. The contribution appears incremental: it builds on existing knowledge discovery approaches and adds a specific architectural design.

The paper tackles the challenge of converting digitized data into useful knowledge by proposing a microservices-based architecture for knowledge discovery. The architecture includes features such as keyword extraction and natural language database queries. It has been implemented in a demonstrator, and an extended version is in use at the German patent office.

The first step towards digitalization within organizations lies in digitization - the conversion of analog data into digitally stored data. This basic step is the prerequisite for all following activities like the digitalization of processes or the servitization of products or offerings. However, digitization itself often leads to 'data-rich' but 'knowledge-poor' material. Knowledge discovery and knowledge extraction as approaches try to increase the usefulness of digitized data. In this paper, we point out the key challenges in the context of knowledge discovery and present an approach to addressing these using a microservices architecture. Our solution led to a conceptual design focusing on keyword extraction, similarity calculation of documents, database queries in natural language, and programming language independent provision of the extracted information. In addition, the conceptual design provides referential design guidelines for integrating processes and applications for semi-automatic learning, editing, and visualization of ontologies. The concept also uses a microservices architecture to address non-functional requirements, such as scalability and resilience. The evaluation of the specified requirements is performed using a demonstrator that implements the concept. Furthermore, this modern approach is used in the German patent office in an extended version.
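The abstract names keyword extraction and document similarity calculation but does not say which techniques are used. As a minimal sketch only, the Python below assumes TF-IDF weighting for keywords and cosine similarity for document comparison. The stop-word list, the function names, and the sample documents are all hypothetical and are not taken from the paper.

```python
import math
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption, not taken from the paper).
STOP_WORDS = {"a", "an", "and", "are", "by", "the", "of", "in", "to", "is", "for", "on", "with"}


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens that are not stop words."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOP_WORDS]


def tfidf_vectors(documents):
    """Return one {term: weight} dict per document, using smoothed TF-IDF."""
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term occurs.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1  # avoid division by zero for empty documents
        vectors.append({
            term: (count / total) * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1.0)
            for term, count in counts.items()
        })
    return vectors


def top_keywords(vector, k=3):
    """Return the k highest-weighted terms; ties are broken alphabetically."""
    ranked = sorted(vector.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]


def cosine_similarity(u, v):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(w * v.get(term, 0.0) for term, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical sample documents, for illustration only.
docs = [
    "Patent claims describe a microservices architecture for document search.",
    "A microservices architecture improves scalability of document search.",
    "Ontologies are edited and visualized by domain experts.",
]
vectors = tfidf_vectors(docs)
print(top_keywords(vectors[0]))
print(cosine_similarity(vectors[0], vectors[1]))
print(cosine_similarity(vectors[0], vectors[2]))
```

In a microservices setting of the kind the abstract describes, functions like these would plausibly sit behind a language-neutral interface such as HTTP with JSON payloads, so that clients in any programming language can request keywords or similarity scores. That deployment detail is likewise an assumption here.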

