Inferring COVID-19 Biological Pathways from Clinical Phenotypes via Topological Analysis
The paper addresses the challenge medical researchers face in analyzing unstructured clinical data to understand COVID-19, though the contribution appears incremental because it applies existing topological methods to a new domain.
The authors tackle the problem of identifying COVID-19 biological pathways from unstructured clinical notes. They propose a three-step pipeline based on topological analysis and report, on a publicly available dataset, that it can extract meaningful pathways.
COVID-19 has caused thousands of deaths around the world and has resulted in large-scale international economic disruption. Identifying the pathways associated with this illness can help medical researchers better understand the properties of the condition. This process can be carried out by analyzing medical records, and it is crucial to develop tools and models that can aid researchers with this process in a timely manner. However, medical records are often unstructured clinical notes, which poses significant challenges to developing automated systems. In this article, we propose a pipeline to aid practitioners in analyzing clinical notes and revealing the pathways associated with this disease. Our pipeline relies on topological properties and consists of three steps: 1) pre-processing the clinical notes to extract the salient concepts, 2) constructing a feature space of the patients to characterize the extracted concepts, and 3) leveraging the topological properties to distill the available knowledge and visualize the result. Our experiments on a publicly available dataset of COVID-19 clinical notes indicate that our pipeline can extract meaningful pathways.
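The three steps above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not name the concept extractor, the feature representation, or the specific topological method, so everything here is an assumption chosen for illustration. The concept vocabulary `CONCEPTS`, the substring matcher, the binary feature vectors, the Hamming distance, and the `radius` parameter are all hypothetical. As a stand-in for the topological step, the sketch connects patients whose feature vectors are close and reports the connected components of that neighborhood graph, which is the simplest topological summary of the feature space.

```python
from itertools import combinations

# Hypothetical concept vocabulary (an assumption, not from the paper).
# A real system would use a clinical NLP tool and handle negation.
CONCEPTS = ["fever", "cough", "dyspnea", "pneumonia", "anosmia"]


def extract_concepts(note):
    """Step 1: return the vocabulary concepts mentioned in a clinical note."""
    text = note.lower()
    return {c for c in CONCEPTS if c in text}


def build_features(notes):
    """Step 2: one binary vector per patient, one dimension per concept."""
    rows = []
    for note in notes:
        found = extract_concepts(note)
        rows.append([1 if c in found else 0 for c in CONCEPTS])
    return rows


def hamming(u, v):
    """Number of concepts on which two patients differ."""
    return sum(a != b for a, b in zip(u, v))


def neighborhood_graph(vectors, radius):
    """Step 3 (stand-in): link patients that differ in at most `radius` concepts."""
    return {
        (i, j)
        for i, j in combinations(range(len(vectors)), 2)
        if hamming(vectors[i], vectors[j]) <= radius
    }


def connected_components(n, edges):
    """Group the n patients into connected components using union-find."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i, j in edges:
        parent[find(i)] = find(j)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Toy notes invented for this example (not taken from any dataset).
notes = [
    "Patient presents with fever and dry cough.",
    "Fever, cough, and mild dyspnea noted.",
    "Reports anosmia since yesterday.",
    "Dyspnea with pneumonia on imaging.",
]

vectors = build_features(notes)
components = connected_components(len(notes), neighborhood_graph(vectors, radius=1))
print(components)  # [[0, 1], [2], [3]]
```

With `radius=1`, only the first two patients are linked, because their notes differ in a single concept (dyspnea). Raising the radius merges more patients into one component, which is the kind of scale-dependent structure a fuller topological analysis would track and visualize.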