CL · Aug 6, 2019

Clustering of Deep Contextualized Representations for Summarization of Biomedical Texts

arXiv:1908.02286v20.39 citations · Has Code

Originality: Synthesis-oriented

AI Analysis

This work addresses the challenge of resource-intensive domain knowledge construction for biomedical text summarization, offering a more efficient alternative.

The paper tackles biomedical text summarization by using BERT's contextualized representations to measure sentence similarity and quantify informative content, resulting in improved performance without domain knowledge.

In recent years, summarizers that incorporate domain knowledge into the process of text summarization have outperformed generic methods, especially for summarization of biomedical texts. However, construction and maintenance of domain knowledge bases are resource-intensive tasks requiring significant manual annotation. In this paper, we demonstrate that contextualized representations extracted from the pre-trained deep language model BERT can be effectively used to measure the similarity between sentences and to quantify the informative content. The results show that our BERT-based summarizer can improve the performance of biomedical summarization. Although the summarizer does not use any sources of domain knowledge, it can capture the context of sentences more accurately than the comparison methods. The source code and data are available at https://github.com/BioTextSumm/BERT-based-Summ.
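
The abstract's core idea, grouping sentences by the similarity of their contextualized vectors and then picking informative ones, can be illustrated with a minimal sketch. This is not the authors' implementation: the toy three-dimensional vectors stand in for BERT sentence representations, and the greedy threshold clustering, the `threshold` value, and the "closest to centroid" selection rule are all assumptions chosen for brevity.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def cluster_sentences(vectors, threshold=0.8):
    """Greedy single-pass clustering (an assumed stand-in algorithm).

    Each vector joins the first cluster whose centroid is at least
    `threshold` similar; otherwise it starts a new cluster.
    Returns a list of clusters, each a list of sentence indices.
    """
    clusters = []
    for i, vec in enumerate(vectors):
        for members in clusters:
            if cosine(vec, centroid([vectors[j] for j in members])) >= threshold:
                members.append(i)
                break
        else:
            clusters.append([i])
    return clusters

def summarize(sentences, vectors, threshold=0.8, max_sentences=2):
    """Pick one representative sentence from each of the largest clusters."""
    clusters = cluster_sentences(vectors, threshold)
    clusters.sort(key=len, reverse=True)  # larger clusters first
    picked = []
    for members in clusters[:max_sentences]:
        c = centroid([vectors[j] for j in members])
        # Representative = member most similar to its cluster centroid.
        picked.append(max(members, key=lambda j: cosine(vectors[j], c)))
    return [sentences[i] for i in sorted(picked)]  # keep document order

# Hypothetical data: in practice the vectors would come from a model
# such as BERT (for example, pooled token representations per sentence).
sentences = [
    "Gene X is overexpressed in tumor tissue.",
    "Tumor samples show elevated expression of gene X.",
    "Patients received the drug for twelve weeks.",
    "The treatment period lasted three months.",
]
vectors = [
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.9, 0.2],
]

clusters = cluster_sentences(vectors)
summary = summarize(sentences, vectors)
print(clusters)
print(summary)
```

With real embeddings, the threshold and the number of selected sentences would need tuning, and a standard clustering method could replace the greedy pass used here.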
