Knowledge Graph Extraction from Biomedical Literature for Alkaptonuria Rare Disease
This work addresses the underrepresentation of rare diseases in biomedical knowledge graphs and helps researchers and clinicians understand alkaptonuria (AKU). Its contribution is incremental, since it applies existing text-mining methods to a new disease domain.
To address the limited knowledge about Alkaptonuria (AKU), an ultra-rare disease, the researchers constructed knowledge graphs from biomedical literature using a text-mining methodology based on PubTator3. The resulting graphs revealed systemic interactions, comorbidities, and potential therapeutic targets for the disease.
Alkaptonuria (AKU) is an ultra-rare autosomal recessive metabolic disorder caused by mutations in the HGD (homogentisate 1,2-dioxygenase) gene, which lead to a pathological accumulation of homogentisic acid (HGA) in body fluids and tissues. This accumulation causes systemic manifestations, including premature spondyloarthropathy, renal and prostatic stones, and cardiovascular complications. Because the disease is ultra-rare, the available data are limited, in terms of both clinical records and literature. Knowledge graphs (KGs) can help connect what is known about the disease (basic mechanisms, manifestations, and existing therapies) with other biomedical knowledge; however, AKU is frequently underrepresented or entirely absent in existing biomedical KGs. In this work, we apply a text-mining methodology based on PubTator3 for large-scale extraction of biomedical relations. We construct two KGs of different sizes, validate them against existing biochemical knowledge, and use them to extract genes, diseases, and therapies possibly related to AKU. This computational framework reveals the systemic interactions of the disease, its comorbidities, and potential therapeutic targets, demonstrating that the approach is effective for analyzing rare metabolic disorders.
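As a rough illustration of the final step described above (using a KG to list entities possibly related to AKU), the following minimal sketch builds a small typed graph from relation triples and queries the neighbors of a disease node. It is not the authors' pipeline and does not call PubTator3: the triple layout, the relation label `associate`, and the helper names `build_graph` and `neighbors_by_type` are assumptions made for this example, and the few triples only restate associations mentioned in the abstract.

```python
from collections import defaultdict

# Illustrative triples: (subject, subject_type, relation, object, object_type).
# Entities are taken from the abstract; the relation label is a placeholder
# assumption, not the output of any real extraction tool.
TRIPLES = [
    ("HGD", "Gene", "associate", "Alkaptonuria", "Disease"),
    ("Alkaptonuria", "Disease", "associate", "homogentisic acid", "Chemical"),
    ("Alkaptonuria", "Disease", "associate", "spondyloarthropathy", "Disease"),
    ("Alkaptonuria", "Disease", "associate", "renal stones", "Disease"),
    ("HGD", "Gene", "associate", "homogentisic acid", "Chemical"),
]


def build_graph(triples):
    """Build an undirected adjacency map plus a node -> entity type lookup."""
    graph = defaultdict(set)
    types = {}
    for subj, subj_type, rel, obj, obj_type in triples:
        # Store each edge in both directions so neighbors can be queried
        # from either endpoint.
        graph[subj].add((rel, obj))
        graph[obj].add((rel, subj))
        types[subj] = subj_type
        types[obj] = obj_type
    return graph, types


def neighbors_by_type(graph, types, node, wanted_type):
    """Return the sorted direct neighbors of `node` with the given entity type."""
    return sorted({nbr for _, nbr in graph.get(node, ()) if types[nbr] == wanted_type})


graph, types = build_graph(TRIPLES)
genes = neighbors_by_type(graph, types, "Alkaptonuria", "Gene")
diseases = neighbors_by_type(graph, types, "Alkaptonuria", "Disease")
chemicals = neighbors_by_type(graph, types, "Alkaptonuria", "Chemical")
print("Genes:", genes)
print("Diseases:", diseases)
print("Chemicals:", chemicals)
```

In a real setting the triples would come from large-scale relation extraction over the literature, and the query would typically extend beyond direct neighbors (for example, to two-hop paths) to surface candidate comorbidities and therapies.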