Biomedical Data-to-Text Generation via Fine-Tuning Transformers
This work addresses data-to-text generation for biomedical applications, but its contribution is incremental: it applies existing methods to a new domain.
The authors fine-tune transformers on a real-world dataset of European medicine package leaflets and show that the models can generate realistic multisentence text, though with important limitations. They also release a new dataset, BioLeaflets, for benchmarking data-to-text generation models in the biomedical domain.
Data-to-text (D2T) generation in the biomedical domain is a promising, yet mostly unexplored, field of research. Here, we apply neural models for D2T generation to a real-world dataset consisting of package leaflets of European medicines. We show that fine-tuned transformers are able to generate realistic, multisentence text from data in the biomedical domain, yet have important limitations. We also release a new dataset (BioLeaflets) for benchmarking D2T generation models in the biomedical domain.