CL | AI | Feb 25, 2021

Spanish Biomedical and Clinical Language Embeddings

arXiv:2102.12843v1 | 3 citations
Originality: Synthesis-oriented
AI Analysis

This work provides improved embeddings for Spanish biomedical and clinical text, which is an incremental advancement for natural language processing in this domain.

The researchers computed Spanish biomedical and clinical language embeddings with FastText, using Byte Pair Encoding for the sub-word representations. Their biomedical word embeddings outperformed previous versions, which suggests that more training data yields better representations.

We computed both word and sub-word embeddings using FastText. For the sub-word embeddings, we selected the Byte Pair Encoding (BPE) algorithm to represent the sub-words. We evaluated the biomedical word embeddings and obtained better results than previous versions, which suggests that more data yields better representations.
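To make the sub-word step concrete, here is a minimal sketch of how BPE learns merges and then segments an unseen word. It is not the authors' pipeline: the toy word counts and the function names (`merge_pair`, `learn_bpe`, `segment`) are hypothetical and chosen for illustration only, and the FastText training itself is not shown.

```python
from collections import Counter

END = "</w>"  # end-of-word marker, a common BPE convention (assumed here)


def merge_pair(symbols, pair):
    """Replace every adjacent occurrence of `pair` in `symbols` with one merged symbol."""
    out = []
    i = 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return tuple(out)


def learn_bpe(word_freqs, num_merges):
    """Learn up to `num_merges` merges from a {word: frequency} mapping."""
    # Start from single characters plus the end-of-word marker.
    vocab = {tuple(word) + (END,): freq for word, freq in word_freqs.items()}
    merges = []
    for _ in range(num_merges):
        pairs = Counter()
        for symbols, freq in vocab.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        # Pick the most frequent pair; break ties on the pair itself for determinism.
        best = max(pairs, key=lambda p: (pairs[p], p))
        merges.append(best)
        vocab = {merge_pair(symbols, best): freq for symbols, freq in vocab.items()}
    return merges


def segment(word, merges):
    """Split a word into sub-words by replaying the learned merges in order."""
    symbols = tuple(word) + (END,)
    for pair in merges:
        symbols = merge_pair(symbols, pair)
    return list(symbols)


# Hypothetical toy counts, standing in for a real biomedical corpus.
toy_counts = {"hepatitis": 5, "gastritis": 4, "dermatitis": 3, "artritis": 2}
merges = learn_bpe(toy_counts, num_merges=10)
print(merges)
print(segment("nefritis", merges))  # a word not seen during merge learning
```

Because the shared suffix of the toy words is frequent, its characters are merged early, so an unseen word with the same suffix is still split into known pieces. That is the property that makes sub-word units useful for rare clinical terms.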

