Tracking Short-Term Temporal Linguistic Dynamics to Characterize Candidate Therapeutics for COVID-19 in the CORD-19 Corpus
This study addresses the problem of helping researchers pre-screen new candidate therapeutics early, by analyzing trends in the scientific literature, and offers an incremental approach to doing so.
The authors analyzed the CORD-19 corpus to track the temporal linguistic dynamics of candidate COVID-19 therapeutics. They investigated whether changes associated with these therapeutics could be detected and measured over time as the scientific literature grew.
Scientific literature tends to grow as a function of funding and interest in a given field. Mining such literature can reveal trends that may not be immediately apparent. The CORD-19 corpus is a growing collection of scientific literature associated with COVID-19. We examined the intersection of a set of candidate therapeutics identified in a drug-repurposing study with temporal instances of the CORD-19 corpus to determine whether it was possible to find and measure changes associated with them over time. We propose that the techniques we used could form the basis of a tool to pre-screen new candidate therapeutics early in the research process.
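As a rough illustration of what measuring change across temporal instances of a corpus could look like, the sketch below computes, for each candidate name, the fraction of documents mentioning it in each dated snapshot and the change between consecutive snapshots. This is not the method used in the study, which the text above does not specify; the function names, the mention-rate measure, the placeholder names drugA and drugB, and the toy snapshots are all assumptions made for illustration.

```python
import re
from typing import Dict, List


def mention_rates(
    snapshots: Dict[str, List[str]], candidates: List[str]
) -> Dict[str, Dict[str, float]]:
    """Fraction of documents in each snapshot that mention each candidate.

    Assumption: snapshots are keyed by ISO dates, so sorting the keys
    puts them in chronological order.
    """
    rates: Dict[str, Dict[str, float]] = {}
    for name in candidates:
        # Whole-word, case-insensitive match on the candidate name.
        pattern = re.compile(r"\b" + re.escape(name) + r"\b", re.IGNORECASE)
        rates[name] = {}
        for date in sorted(snapshots):
            docs = snapshots[date]
            hits = sum(1 for text in docs if pattern.search(text))
            rates[name][date] = hits / len(docs) if docs else 0.0
    return rates


def rate_changes(rates_for_name: Dict[str, float]) -> List[float]:
    """Differences in mention rate between consecutive snapshots."""
    dates = sorted(rates_for_name)
    return [rates_for_name[b] - rates_for_name[a] for a, b in zip(dates, dates[1:])]


# Hypothetical toy data standing in for two dated corpus instances.
snapshots = {
    "2020-04-01": [
        "drugA inhibits viral replication in vitro.",
        "Epidemiology of the outbreak.",
    ],
    "2020-05-01": [
        "drugA trial results.",
        "drugA and drugB compared.",
        "drugB reduces inflammation.",
        "Transmission modelling.",
    ],
}
candidates = ["drugA", "drugB"]  # placeholder names, not real therapeutics

rates = mention_rates(snapshots, candidates)
changes = {name: rate_changes(rates[name]) for name in candidates}
print(changes)
```

A document-level mention rate is only one possible measure. Because it normalizes by snapshot size, it separates growing interest in a candidate from the overall growth of the corpus, but it ignores synonyms, brand names, and the context in which a candidate is mentioned.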