CL · Jun 8, 2023

CUED at ProbSum 2023: Hierarchical Ensemble of Summarization Models

arXiv:2306.05317
Originality: Incremental advance
AI Analysis

This work addresses medical note summarization for healthcare professionals, representing an incremental improvement over existing methods.

The paper tackled summarizing medical progress notes with limited data by introducing a hierarchical ensemble of fine-tuned Clinical-T5 models, achieving a ROUGE-L score of 32.77 and topping the shared task leaderboard.

In this paper, we consider the challenge of summarizing patients' medical progress notes in a limited data setting. For the Problem List Summarization (shared task 1A) at the BioNLP Workshop 2023, we demonstrate that Clinical-T5 fine-tuned on 765 medical clinic notes outperforms other extractive, abstractive and zero-shot baselines, yielding reasonable baseline systems for medical note summarization. Further, we introduce the Hierarchical Ensemble of Summarization Models (HESM), consisting of token-level ensembles of diverse fine-tuned Clinical-T5 models, followed by Minimum Bayes Risk (MBR) decoding. Our HESM approach led to a considerable boost in summarization performance and, when evaluated on held-out challenge data, achieved a ROUGE-L of 32.77, making it the best-performing system on the shared task leaderboard.
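The two stages named in the abstract, token-level ensembling and MBR decoding, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not say how the model distributions are combined or which utility function MBR uses, so the sketch assumes a uniform average of next-token probabilities and a whitespace-tokenized ROUGE-L F1 as the utility. All function names (`average_token_distributions`, `rouge_l_f1`, `mbr_select`) are hypothetical.

```python
from typing import Callable, List, Sequence


def average_token_distributions(dists: Sequence[Sequence[float]]) -> List[float]:
    """Token-level ensemble step (assumed uniform weighting): average the
    next-token probability vectors produced by several models."""
    n_models = len(dists)
    vocab_size = len(dists[0])
    return [sum(d[i] for d in dists) / n_models for i in range(vocab_size)]


def lcs_length(a: Sequence[str], b: Sequence[str]) -> int:
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(candidate: str, reference: str) -> float:
    """Simplified ROUGE-L F1 on whitespace tokens (no stemming), used here
    only as a stand-in utility function."""
    c, r = candidate.split(), reference.split()
    if not c or not r:
        return 0.0
    lcs = lcs_length(c, r)
    if lcs == 0:
        return 0.0
    precision, recall = lcs / len(c), lcs / len(r)
    return 2 * precision * recall / (precision + recall)


def mbr_select(
    candidates: Sequence[str],
    utility: Callable[[str, str], float] = rouge_l_f1,
) -> str:
    """MBR decoding over a candidate set: return the candidate with the
    highest average utility against all other candidates."""

    def expected_utility(i: int) -> float:
        others = [c for j, c in enumerate(candidates) if j != i]
        if not others:
            return 0.0
        return sum(utility(candidates[i], o) for o in others) / len(others)

    best = max(range(len(candidates)), key=expected_utility)
    return candidates[best]


if __name__ == "__main__":
    # Hypothetical summaries, e.g. one per ensemble in the hierarchy.
    summaries = [
        "acute kidney injury ; sepsis",
        "acute kidney injury ; sepsis ; anemia",
        "acute kidney injury",
        "pneumonia",
    ]
    print(mbr_select(summaries))
```

In this sketch, MBR favors the candidate that agrees most with the rest of the pool, so an outlier summary is unlikely to be selected even if one model scored it highly.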

Code Implementations: 1 repo