Constructing Information-Lossless Biological Knowledge Graphs from Conditional Statements
This work addresses the need for more accurate biological knowledge graphs by handling conditional statements, though it appears incremental: it builds on existing extraction methods and adds a new tag schema and a sequence tagging framework.
The paper tackles the problem of extracting structured information from biological literature by modeling conditions and attributes, which existing methods ignore, and demonstrates that the method yields an information-lossless structure.
Conditions are essential in the statements of biological literature. Without the precisely specified conditions (e.g., environment, equipment), the facts (e.g., observations) in the statements may no longer be valid. One biological statement has one or more facts and/or conditions. The subject and object of each fact or condition can be either a concept or a concept's attribute. Existing information extraction methods consider neither the role of conditions in biological statements nor the role of attributes in the subject and object. In this work, we design a new tag schema and propose a deep sequence tagging framework to structure conditional statements into fact and condition tuples from biological text. Experiments demonstrate that our method yields an information-lossless structure of the literature.
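To make the idea of fact and condition tuples concrete, here is a minimal Python sketch of how a tagged sentence might be decoded into such tuples. It is an illustration under stated assumptions, not the paper's schema or model: the tag labels (`subject.concept`, `object.attribute`, and so on), the use of one tag sequence per tuple kind, the names `Role`, `StatementTuple`, and `decode`, and the example sentence are all hypothetical. The tags are written by hand here; in the paper they would come from the sequence tagging framework.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Role:
    """A subject or object: a concept, optionally with one of its attributes."""
    concept: str
    attribute: Optional[str] = None


@dataclass(frozen=True)
class StatementTuple:
    """One fact or condition tuple extracted from a statement."""
    kind: str  # "fact" or "condition"
    subject: Role
    relation: str
    obj: Role


def decode(tokens: List[str], tags: List[str], kind: str) -> StatementTuple:
    """Collect tagged tokens by label and assemble a single tuple.

    Assumes (hypothetically) BIO-style tags such as "B-subject.concept",
    and at most one span per label, so B- and I- tokens of the same
    label are simply joined in order.
    """
    spans: Dict[str, List[str]] = {}
    for token, tag in zip(tokens, tags):
        if tag == "O":  # token plays no role in this tuple
            continue
        _, label = tag.split("-", 1)
        spans.setdefault(label, []).append(token)
    text = {label: " ".join(words) for label, words in spans.items()}
    return StatementTuple(
        kind=kind,
        subject=Role(text["subject.concept"], text.get("subject.attribute")),
        relation=text["relation"],
        obj=Role(text["object.concept"], text.get("object.attribute")),
    )


# Made-up example sentence, not taken from the paper.
tokens = ["high", "temperature", "reduces", "the", "activity", "of",
          "enzyme", "X", "in", "yeast", "cells"]
# Hand-written tags: one sequence for the fact, one for the condition.
fact_tags = ["B-subject.concept", "I-subject.concept", "B-relation", "O",
             "B-object.attribute", "O", "B-object.concept",
             "I-object.concept", "O", "O", "O"]
cond_tags = ["O", "O", "O", "O", "O", "O", "B-subject.concept",
             "I-subject.concept", "B-relation", "B-object.concept",
             "I-object.concept"]

fact = decode(tokens, fact_tags, "fact")
condition = decode(tokens, cond_tags, "condition")
```

In this sketch the fact's object is the attribute "activity" of the concept "enzyme X", while the condition ties "enzyme X" to "yeast cells". Using a separate tag sequence per tuple kind lets the same tokens take part in both a fact and a condition, which a single flat tag sequence could not express.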