IR · CL · Feb 1, 2022

Semantic Annotation and Querying Framework based on Semi-structured Ayurvedic Text

arXiv:2202.00216v1292 citations
Originality: Synthesis-oriented
AI Analysis

This work provides a manually curated knowledge graph for a Sanskrit Ayurvedic text. The contribution is incremental: it adapts existing tools to address a domain-specific bottleneck in NLP for low-resource languages.

The authors tackled the lack of automated knowledge base construction in Sanskrit NLP by manually annotating an Ayurvedic text to create a knowledge graph with 410 entities and 764 relationships, and they developed an ontology and query templates to enable semantic querying.

Knowledge bases (KBs) are an important resource in a number of natural language processing (NLP) and information retrieval (IR) tasks, such as semantic search and automated question answering. They are also useful for researchers trying to extract information from a text. However, the state of the art in Sanskrit NLP does not yet allow automated construction of knowledge bases, because the required tools and methods are either unavailable or insufficiently accurate. In this work, we therefore describe our efforts on manual annotation of Sanskrit text for the purpose of knowledge graph (KG) creation. We choose the chapter Dhanyavarga from Bhavaprakashanighantu of the Ayurvedic text Bhavaprakasha for annotation. The constructed knowledge graph contains 410 entities and 764 relationships. Since Bhavaprakashanighantu is a technical glossary text that describes various properties of different substances, we develop an elaborate ontology to capture the semantics of the entity and relationship types present in the text. To query the knowledge graph, we design 31 query templates that cover most of the common question patterns. For both manual annotation and querying, we customize the Sangrahaka framework previously developed by us. The entire system, including the dataset, is available from https://sanskrit.iitk.ac.in/ayurveda/ . We hope that the knowledge graph we have created through manual annotation and subsequent curation will help in the development and testing of NLP tools in the future, as well as in the study of the Bhavaprakashanighantu text.
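
To make the idea of a query template over an entity-relationship graph more concrete, here is a minimal sketch in plain Python. It is not the Sangrahaka API or the authors' ontology: the `Triple` and `QueryTemplate` classes, the relation names `IS_A` and `HAS_PROPERTY`, and the placeholder entities such as `substance_A` are all hypothetical and chosen only for illustration. The sketch assumes that a template pairs a natural-language question pattern containing one slot with a fixed relation to match in the graph.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Triple:
    """One edge of the graph: subject --predicate--> obj."""
    subject: str
    predicate: str
    obj: str


# Hypothetical toy graph; the names are placeholders, not data from the paper.
TRIPLES: List[Triple] = [
    Triple("substance_A", "IS_A", "grain"),
    Triple("substance_A", "HAS_PROPERTY", "property_X"),
    Triple("substance_B", "IS_A", "grain"),
    Triple("substance_B", "HAS_PROPERTY", "property_Y"),
    Triple("substance_B", "HAS_PROPERTY", "property_X"),
]


@dataclass(frozen=True)
class QueryTemplate:
    """A question pattern with one slot, bound to one relation type."""
    text: str       # question pattern containing a {value} slot
    predicate: str  # relation to match in the graph

    def render(self, value: str) -> str:
        # Fill the slot to obtain the user-facing question.
        return self.text.format(value=value)

    def run(self, triples: List[Triple], value: str) -> List[str]:
        # Return, in sorted order, the subjects linked to `value`
        # through this template's relation.
        return sorted(
            {t.subject for t in triples
             if t.predicate == self.predicate and t.obj == value}
        )


WHICH_HAVE_PROPERTY = QueryTemplate(
    text="Which substances have the property {value}?",
    predicate="HAS_PROPERTY",
)

if __name__ == "__main__":
    print(WHICH_HAVE_PROPERTY.render("property_X"))
    print(WHICH_HAVE_PROPERTY.run(TRIPLES, "property_X"))
```

Under these assumptions, one template serves every question of the same shape: only the slot value changes, while the relation to match stays fixed. A real system would typically run such templates against a graph database rather than an in-memory list.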
