Mar 28, 2020

Knowledge synthesis from 100 million biomedical documents augments the deep expression profiling of coronavirus receptors

arXiv:2003.12773v1
55 citations
AI Analysis

This work addresses the urgent need for real-time biomedical data integration during the COVID-19 pandemic, offering a platform to accelerate biological insights, though it is incremental in applying existing methods to new data.

The researchers developed the nferX platform to synthesize biomedical knowledge about COVID-19 mechanisms. The platform draws on over 45 quadrillion possible conceptual associations extracted from unstructured biomedical text and triangulates them with single-cell RNA-sequencing data from over 25 tissues. It identifies tongue keratinocytes and olfactory epithelial cells as likely under-appreciated infection targets, consistent with the reported loss of taste and smell as early symptoms.

The COVID-19 pandemic demands assimilation of all available biomedical knowledge to decode its mechanisms of pathogenicity and transmission. Despite the recent renaissance in unsupervised neural networks for decoding unstructured natural languages, a platform for the real-time synthesis of the exponentially growing biomedical literature and its comprehensive triangulation with deep omic insights is not available. Here, we present the nferX platform for dynamic inference from over 45 quadrillion possible conceptual associations extracted from unstructured biomedical text, and their triangulation with Single Cell RNA-sequencing based insights from over 25 tissues. Using this platform, we identify intersections between the pathologic manifestations of COVID-19 and the comprehensive expression profile of the SARS-CoV-2 receptor ACE2. We find that tongue keratinocytes and olfactory epithelial cells are likely under-appreciated targets of SARS-CoV-2 infection, correlating with reported loss of sense of taste and smell as early indicators of COVID-19 infection, including in otherwise asymptomatic patients. Airway club cells, ciliated cells and type II pneumocytes in the lung, and enterocytes of the gut also express ACE2. This study demonstrates how a holistic data science platform can leverage unprecedented quantities of structured and unstructured publicly available data to accelerate the generation of impactful biological insights and hypotheses.
