LG | AI | CL | Dec 9, 2022

MED-SE: Medical Entity Definition-based Sentence Embedding

arXiv:2212.04734v1 | h-index: 11
Originality: Incremental advance
AI Analysis

This work addresses the problem of accurately representing clinical sentences. It appears incremental, since it builds on existing contrastive learning methods and adds a domain-specific component: the definitions of medical entities.

The authors address sentence embedding for clinical texts by proposing MED-SE, an unsupervised contrastive learning framework that exploits the definitions of medical entities. In the entity-centric semantic textual similarity (STS) setting they designed, MED-SE performs significantly better than existing unsupervised methods such as SimCSE, whose performance degrades in that setting.

We propose Medical Entity Definition-based Sentence Embedding (MED-SE), a novel unsupervised contrastive learning framework designed for clinical texts, which exploits the definitions of medical entities. To this end, we conduct an extensive analysis of multiple sentence embedding techniques in clinical semantic textual similarity (STS) settings. In the entity-centric setting that we have designed, MED-SE achieves significantly better performance, while the existing unsupervised methods including SimCSE show degraded performance. Our experiments elucidate the inherent discrepancies between the general- and clinical-domain texts, and suggest that entity-centric contrastive approaches may help bridge this gap and lead to a better representation of clinical sentences.
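The abstract does not spell out the training objective, so the following is only a minimal sketch of what an entity-definition-based contrastive loss could look like. It assumes a SimCSE-style InfoNCE loss in which each clinical sentence embedding is paired with the embedding of a definition of an entity it mentions (the positive), while the other definitions in the batch serve as negatives. The function names `cosine` and `info_nce_loss`, the temperature value, and the toy vectors are all hypothetical and are not taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce_loss(sentence_embs, definition_embs, temperature=0.05):
    """Mean InfoNCE loss over a batch (illustrative, not the paper's code).

    Assumption: sentence_embs[i] is paired with definition_embs[i] as its
    positive; every other definition in the batch acts as a negative.
    """
    if len(sentence_embs) != len(definition_embs):
        raise ValueError("each sentence needs exactly one paired definition")

    losses = []
    for i, sentence in enumerate(sentence_embs):
        # Temperature-scaled similarity of this sentence to every definition.
        logits = [cosine(sentence, d) / temperature for d in definition_embs]
        # Log-sum-exp with max subtraction for numerical stability.
        top = max(logits)
        log_denominator = top + math.log(sum(math.exp(x - top) for x in logits))
        # Negative log-probability of picking the paired definition.
        losses.append(log_denominator - logits[i])
    return sum(losses) / len(losses)


# Toy 2-d embeddings (made up for illustration only).
sentences = [[1.0, 0.0], [0.0, 1.0]]
aligned_definitions = [[0.9, 0.1], [0.1, 0.9]]    # positives close to their sentences
shuffled_definitions = [[0.1, 0.9], [0.9, 0.1]]   # positives far from their sentences

loss_aligned = info_nce_loss(sentences, aligned_definitions)
loss_shuffled = info_nce_loss(sentences, shuffled_definitions)
```

In this toy setup the loss is lower when each sentence lies close to its paired definition than when the pairs are shuffled, which is the behavior a contrastive objective is meant to reward. In practice the embeddings would come from a trainable encoder rather than fixed lists.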

