CLIR Jul 30, 2013

Extracting Connected Concepts from Biomedical Texts using Fog Index

arXiv:1307.8057v1, 1 citation
Originality: Synthesis-oriented
AI Analysis

This work addresses the challenge of efficiently identifying relevant concept connections in biomedical texts, though it appears incremental as it builds on existing readability metrics and filtering techniques.

The authors extract connected biomedical concepts from texts in two stages. First, they use the Fog Index to select sentences with low readability, which tend to contain both concepts of interest. Second, they apply a filter based on PPV and Sensitivity to remove unconnected pairs. They report experimental results that support the method's effectiveness.
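As a rough illustration of the first filter, the Python sketch below scores each sentence with the standard Gunning Fog formula and keeps the most difficult fraction. The vowel-group syllable counter, the choice to apply the formula to one sentence at a time, and the names fog_index and hardest_sentences are assumptions made for illustration; the source does not say how the authors computed FI per sentence.

```python
import math
import re

VOWEL_GROUP = re.compile(r"[aeiouy]+")


def count_syllables(word: str) -> int:
    """Rough syllable estimate (assumption): number of vowel groups, at least 1."""
    return max(1, len(VOWEL_GROUP.findall(word.lower())))


def fog_index(sentence: str) -> float:
    """Gunning Fog Index applied to a single sentence.

    Standard formula: 0.4 * (words / sentences + 100 * complex_words / words),
    where a complex word has three or more syllables. With one sentence,
    words / sentences is simply the word count.
    """
    words = re.findall(r"[A-Za-z]+", sentence)
    if not words:
        return 0.0
    complex_words = sum(1 for w in words if count_syllables(w) >= 3)
    return 0.4 * (len(words) + 100.0 * complex_words / len(words))


def hardest_sentences(sentences, fraction=0.30):
    """Rank sentences by Fog Index, highest first, and keep the top fraction."""
    ranked = sorted(sentences, key=fog_index, reverse=True)
    keep = math.ceil(len(ranked) * fraction)
    return ranked[:keep]
```

A real pipeline would likely replace the syllable heuristic with a pronunciation dictionary, since vowel-group counting miscounts many biomedical terms.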

In this paper, we establish the Fog Index (FI) as a text filter for locating sentences that contain connected biomedical concepts of interest. To do so, we used 24 randomly chosen papers, each containing four pairs of connected concepts. For each pair, we categorize sentences according to whether they contain both, one, or neither of the concepts. We then use FI to measure the difficulty of the sentences in each category and find that sentences containing both concepts have low readability. We rank the sentences of a text by FI and select the most difficult 30 percent. We use an association matrix to track the most frequent pairs of concepts in the selected sentences. This matrix shows that the first filter produces some pairs that have almost no connection. To remove these unwanted pairs, we use the Equally Weighted Harmonic Mean of their Positive Predictive Value (PPV) and Sensitivity as a second filter. Experimental results demonstrate the effectiveness of our method.
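The second filter in the abstract can be sketched as follows. The equally weighted harmonic mean of PPV and Sensitivity is 2 * PPV * Sensitivity / (PPV + Sensitivity), with PPV = TP / (TP + FP) and Sensitivity = TP / (TP + FN). How the authors count true positives, false positives, and false negatives for a concept pair is not stated here, so the (tp, fp, fn) inputs, the 0.5 threshold, and the name filter_pairs are hypothetical.

```python
def ppv(tp: int, fp: int) -> float:
    """Positive Predictive Value: TP / (TP + FP); 0.0 when undefined."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def sensitivity(tp: int, fn: int) -> float:
    """Sensitivity: TP / (TP + FN); 0.0 when undefined."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def harmonic_mean(p: float, s: float) -> float:
    """Equally weighted harmonic mean of two rates; 0.0 when both are zero."""
    return 2.0 * p * s / (p + s) if (p + s) else 0.0


def filter_pairs(counts, threshold=0.5):
    """Keep concept pairs whose score meets the threshold.

    counts maps a concept pair to a (tp, fp, fn) tuple. Both the tuple
    layout and the threshold value are illustrative assumptions.
    """
    kept = {}
    for pair, (tp, fp, fn) in counts.items():
        score = harmonic_mean(ppv(tp, fp), sensitivity(tp, fn))
        if score >= threshold:
            kept[pair] = score
    return kept
```

The harmonic mean penalizes a pair that does well on only one of the two measures, which is presumably why it suits the task of discarding frequent but unconnected pairs.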

