LG, CL | Sep 3, 2021

Biomedical Data-to-Text Generation via Fine-Tuning Transformers

arXiv:2109.01518v1 | 677 citations
Originality: Synthesis-oriented
AI Analysis

This work addresses data-to-text generation for biomedical applications, but it is incremental as it applies existing methods to a new domain.

The authors fine-tune transformers on a real-world dataset of European medicine package leaflets. They show that the models can generate realistic multisentence text, though with important limitations, and they release a new dataset, BioLeaflets, for benchmarking.

Data-to-text (D2T) generation in the biomedical domain is a promising - yet mostly unexplored - field of research. Here, we apply neural models for D2T generation to a real-world dataset consisting of package leaflets of European medicines. We show that fine-tuned transformers are able to generate realistic, multisentence text from data in the biomedical domain, yet have important limitations. We also release a new dataset (BioLeaflets) for benchmarking D2T generation models in the biomedical domain.
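To make the task setup concrete, here is a minimal sketch of how structured input might be flattened into a single source string before being passed to a fine-tuned sequence-to-sequence transformer. The `linearize` function, the `section:` prefix, and the tag scheme are hypothetical illustrations, not the input format used by the authors or by the BioLeaflets dataset.

```python
from typing import List, Tuple


def linearize(section: str, entities: List[Tuple[str, str]]) -> str:
    """Flatten (entity_type, value) pairs into one source string.

    Hypothetical format: each value is wrapped in tags named after its
    entity type, and the leaflet section name is given as a prefix.
    Pairs whose value is empty or whitespace-only are skipped.
    """
    parts = [
        f"<{etype}> {value.strip()} </{etype}>"
        for etype, value in entities
        if value.strip()
    ]
    return f"section: {section} | " + " ".join(parts)


# Illustrative records only; not taken from the dataset.
example = linearize(
    "side effects",
    [("PROBLEM", "headache"), ("PROBLEM", "nausea"), ("DRUG", " ")],
)
print(example)
```

A string like this would serve as the model input, with the corresponding leaflet section text as the generation target.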

Code Implementations: 1 repo
