CL | Oct 12, 2024

SciGisPy: a Novel Metric for Biomedical Text Simplification via Gist Inference Score

arXiv:2410.09632v122 citations | h-index: 2 | Proceedings of the Third Workshop on Text Simplification, Accessibility and Readability (TSAR 2024)
Originality: Incremental advance
AI Analysis

This paper addresses the challenge of evaluating text simplification for biomedical literature, a task that aims to make such literature more accessible to non-experts. The advance is incremental: it builds on existing theory and adds domain-specific enhancements.

The authors introduce SciGisPy, a metric for evaluating biomedical text simplification that is based on the Gist Inference Score (GIS). On the Cochrane biomedical text simplification dataset, SciGisPy correctly identified 84% of simplified texts, compared with 44.8% for the original GIS formulation.

Biomedical literature is often written in highly specialized language, posing significant comprehension challenges for non-experts. Automatic text simplification (ATS) offers a solution by making such texts more accessible while preserving critical information. However, evaluating ATS for biomedical texts is still challenging due to the limitations of existing evaluation metrics. General-domain metrics like SARI, BLEU, and ROUGE focus on surface-level text features, and readability metrics like FKGL and ARI fail to account for domain-specific terminology or assess how well the simplified text conveys core meanings (gist). To address this, we introduce SciGisPy, a novel evaluation metric inspired by Gist Inference Score (GIS) from Fuzzy-Trace Theory (FTT). SciGisPy measures how well a simplified text facilitates the formation of abstract inferences (gist) necessary for comprehension, especially in the biomedical domain. We revise GIS for this purpose by introducing domain-specific enhancements, including semantic chunking, Information Content (IC) theory, and specialized embeddings, while removing unsuitable indexes. Our experimental evaluation on the Cochrane biomedical text simplification dataset demonstrates that SciGisPy outperforms the original GIS formulation, with a significant increase in correctly identified simplified texts (84% versus 44.8%). The results and a thorough ablation study confirm that SciGisPy better captures the essential meaning of biomedical content, outperforming existing approaches.
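The "correctly identified simplified texts" figure can be read as a pairwise accuracy: the share of (original, simplified) pairs in which the metric scores the simplified text higher. The sketch below illustrates that reading only. It is an assumption about the evaluation protocol, not the authors' code, and the names `pairwise_accuracy` and `toy_score` are hypothetical. `toy_score` is a stand-in that rewards shorter words; it does not implement SciGisPy or GIS.

```python
from typing import Callable, Sequence, Tuple

TextPair = Tuple[str, str]  # (original text, simplified text)


def pairwise_accuracy(pairs: Sequence[TextPair],
                      score: Callable[[str], float]) -> float:
    """Fraction of pairs where the simplified text scores strictly higher.

    Assumed protocol: a metric "correctly identifies" a simplified text
    when it assigns that text a higher score than its original.
    """
    if not pairs:
        raise ValueError("need at least one (original, simplified) pair")
    hits = sum(1 for original, simplified in pairs
               if score(simplified) > score(original))
    return hits / len(pairs)


def toy_score(text: str) -> float:
    """Hypothetical stand-in metric: negative mean word length.

    Shorter words give a higher score. This is NOT SciGisPy or GIS.
    """
    words = text.split()
    return -sum(len(word) for word in words) / len(words)


# Illustrative pairs, invented for this sketch (not from the Cochrane dataset).
pairs = [
    ("Myocardial infarction necessitates immediate intervention",
     "A heart attack needs fast care"),
    ("Drug helps pain",
     "Pharmacotherapy ameliorates discomfort"),
]

accuracy = pairwise_accuracy(pairs, toy_score)
```

Passing a real metric as `score` would yield figures comparable in kind to the 84% and 44.8% reported above, provided the paper's protocol matches this assumption.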

