IR DL · Apr 9, 2016

On the Composition of Scientific Abstracts

Iana Atanassova, Marc Bertin, Vincent Larivière

arXiv:1604.02580v14.833 citations

Originality: Synthesis-oriented

AI Analysis

This paper provides insights into abstract-writing practices for researchers and publishers, but it is incremental, as it builds on existing knowledge of text similarity.

The paper analyzes text re-use between abstracts and article bodies to understand how scientific abstracts are composed. It finds that 84% of abstracts share at least one sentence with the article body and that these sentences come primarily from the introduction and conclusion sections.

Scientific abstracts contain what the author(s) consider to be the information that best describes a document's content. They represent a compressed view of the informational content of a document and allow readers to evaluate the relevance of the document to a particular information need. However, little is known about their composition. This paper contributes to the understanding of the structure of abstracts by comparing scientific abstracts with the text content of research articles. More specifically, using sentence-based similarity metrics, we quantify the phenomenon of text re-use in abstracts and examine where the body sentences that are similar to abstract sentences fall within the IMRaD structure (Introduction, Methods, Results and Discussion), using a corpus of over 85,000 research articles published in the seven PLOS journals. We provide evidence that 84% of abstracts have at least one sentence in common with the body of the article. Our results also show that the sections of the paper from which abstract sentences are taken are invariant across the PLOS journals, with sentences mainly coming from the beginning of the introduction and the end of the conclusion.
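To make the idea of sentence-based re-use detection concrete, here is a minimal Python sketch. It is not the authors' pipeline: the abstract does not name the similarity metric, so word-set Jaccard similarity, the naive regex sentence splitter, the 0.8 threshold, and the function names (`split_sentences`, `jaccard`, `reused_sentences`) are all assumptions chosen for illustration. For each abstract sentence, it reports the best-matching body sentence, the section it came from, and its relative position in that section (0.0 = first sentence, 1.0 = last).

```python
import re


def split_sentences(text):
    # Naive splitter (assumption): break on whitespace that follows ., ! or ?
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokens(sentence):
    # Lowercased set of alphanumeric word tokens.
    return set(re.findall(r"[a-z0-9]+", sentence.lower()))


def jaccard(a, b):
    # Word-set Jaccard similarity; a stand-in for the paper's unspecified metric.
    ta, tb = tokens(a), tokens(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def reused_sentences(abstract, sections, threshold=0.8):
    """Return (abstract_sentence, section, relative_position, score) tuples.

    `sections` maps a section name to its text. The threshold is a
    hypothetical cut-off, not a value taken from the paper.
    """
    matches = []
    for abs_sentence in split_sentences(abstract):
        best = None
        for name, text in sections.items():
            body = split_sentences(text)
            for i, body_sentence in enumerate(body):
                score = jaccard(abs_sentence, body_sentence)
                if score >= threshold and (best is None or score > best[0]):
                    # Position of the sentence within its section, in [0, 1].
                    position = i / max(len(body) - 1, 1)
                    best = (score, name, position)
        if best is not None:
            matches.append((abs_sentence, best[1], best[2], best[0]))
    return matches
```

Aggregating the section names and relative positions returned by such a function over a whole corpus would give the kind of distribution the abstract describes, for example a concentration of matches at the start of the introduction.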
