Biomedical Knowledge Graph Refinement with Embedding and Logic Rules
This work aims to improve the quality of biomedical knowledge graphs, which matters to researchers and practitioners who rely on precise biomedical knowledge, for example in the context of COVID-19.
This paper addresses the problem of conflicts and noise in biomedical knowledge graphs (BioKG) by proposing BioGRER, a method that combines knowledge graph embedding and logic rules. The method formulates BioKG refinement as a probability estimation for triplets and uses a variational EM algorithm for optimization, achieving competitive results on a COVID-19 knowledge graph.
There is a rapidly increasing need for high-quality biomedical knowledge graphs (BioKG) that provide direct and precise biomedical knowledge, and the COVID-19 context makes this need even more pressing. However, most BioKG construction inevitably introduces numerous conflicts and a great deal of noise, stemming from incorrect knowledge descriptions in the literature and from defective information extraction techniques. Many studies have demonstrated that reasoning over the knowledge graph is effective in eliminating such conflicts and noise. This paper proposes BioGRER, a method for improving BioKG quality that combines knowledge graph embedding with logic rules that support or negate triplets in the BioKG. In the proposed model, the BioKG refinement problem is formulated as estimating the probability of each triplet in the BioKG. We employ a variational EM algorithm that alternately optimizes the knowledge graph embedding and the logic rule inference. In this way, the model combines the strengths of knowledge graph embedding and logic rules, leading to better results than using either alone. We evaluate the model on a COVID-19 knowledge graph and obtain competitive results.
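To make the alternating scheme more concrete, the following is a minimal toy sketch in Python. It is not the BioGRER implementation, and every specific in it is an assumption made for illustration: the TransE-style score, the fixed per-triplet rule evidence weights (positive for a supporting rule, negative for a negating rule), the function names `embedding_logit` and `refine`, and the example entities. The E-step combines the embedding score with the rule evidence to estimate each triplet's probability; the M-step takes one gradient step that moves the embedding-based probability toward that estimate.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def embedding_logit(emb, triplet, margin=1.0):
    """TransE-style logit (an assumed scoring function): margin - ||h + r - t||^2."""
    h, r, t = (emb[name] for name in triplet)
    return margin - sum((a + b - c) ** 2 for a, b, c in zip(h, r, t))


def refine(triplets, rule_evidence, dim=4, steps=200, lr=0.1, seed=0):
    """Alternate between rule-informed estimation and embedding updates.

    rule_evidence maps a triplet to a hypothetical, hand-set log-odds weight:
    positive if logic rules support it, negative if they negate it.
    """
    rng = random.Random(seed)
    names = sorted({name for triplet in triplets for name in triplet})
    emb = {n: [rng.uniform(-0.1, 0.1) for _ in range(dim)] for n in names}
    posterior = {}
    for _ in range(steps):
        # E-step: estimate each triplet's probability from the current
        # embedding logit plus the evidence contributed by logic rules.
        for tr in triplets:
            logit = embedding_logit(emb, tr) + rule_evidence.get(tr, 0.0)
            posterior[tr] = sigmoid(logit)
        # M-step: one gradient step on the cross-entropy between the
        # embedding-based probability and the E-step estimate.
        for tr in triplets:
            h, r, t = (emb[name] for name in tr)
            err = sigmoid(embedding_logit(emb, tr)) - posterior[tr]
            for k in range(dim):
                diff = h[k] + r[k] - t[k]
                grad = err * -2.0 * diff  # d(loss)/d(h_k), same for r_k
                h[k] -= lr * grad
                r[k] -= lr * grad
                t[k] += lr * grad  # d(loss)/d(t_k) has the opposite sign
    return posterior


if __name__ == "__main__":
    # Hypothetical candidate triplets; names are placeholders, not real data.
    candidates = [
        ("drugA", "treats", "covid19"),
        ("drugB", "treats", "covid19"),
        ("drugC", "treats", "covid19"),
    ]
    evidence = {candidates[0]: 2.0, candidates[1]: -2.0}  # drugC: no rule fires
    for triplet, prob in refine(candidates, evidence).items():
        print(triplet, round(prob, 3))
```

In this sketch a triplet would be kept or discarded by thresholding its estimated probability. The rule weights are held fixed to keep the example short; a fuller treatment would presumably learn them together with the embeddings.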