CL LG · Sep 29, 2025

Hype or not? Formalizing Automatic Promotional Language Detection in Biomedical Research

Bojan Batalo, Erica K. Shimomoto, Neil Millar

arXiv:2509.24638v1 · 2.7 · h-index: 7

Originality: Incremental advance

AI Analysis

This paper addresses how hype undermines scientific objectivity in biomedical research; it represents a novel application of NLP to this domain.

The paper tackles the detection of promotional language ('hype') in biomedical research by formalizing annotation guidelines and building a dataset from NIH grant applications. Machine learning models trained on this dataset yield promising results for automatic detection.

In science, promotional language ('hype') is increasing and can undermine objective evaluation of evidence, impede research development, and erode trust in science. In this paper, we introduce the task of automatic detection of hype, which we define as hyperbolic or subjective language that authors use to glamorize, promote, embellish, or exaggerate aspects of their research. We propose formalized guidelines for identifying hype language and apply them to annotate a portion of the National Institutes of Health (NIH) grant application corpus. We then evaluate traditional text classifiers and language models on this task, comparing their performance with a human baseline. Our experiments show that formalizing annotation guidelines can help humans reliably annotate candidate hype adjectives and that using our annotated dataset to train machine learning models yields promising results. Our findings highlight the linguistic complexity of the task, and the potential need for domain knowledge and temporal awareness of the facts. While some linguistic works address hype detection, to the best of our knowledge, we are the first to approach it as a natural language processing task.
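To make the task concrete, here is a minimal sketch of the first step the abstract implies: locating candidate hype adjectives in a sentence so that each one can then be judged as hype or not. The lexicon and the function name `find_candidates` are hypothetical and chosen only for illustration; they are not the authors' word list or code, and a plain lexicon lookup stands in for whatever candidate-selection procedure the paper actually uses.

```python
import re

# Hypothetical candidate lexicon, for illustration only (not the paper's list).
CANDIDATE_ADJECTIVES = {
    "novel", "innovative", "unprecedented", "critical", "groundbreaking",
}

def find_candidates(sentence):
    """Return (adjective, start_offset) pairs for each lexicon hit, in order."""
    hits = []
    for match in re.finditer(r"[A-Za-z]+", sentence):
        word = match.group(0).lower()
        if word in CANDIDATE_ADJECTIVES:
            hits.append((word, match.start()))
    return hits

sentence = "We propose a novel approach to a critical illness."
candidates = find_candidates(sentence)
print(candidates)
```

A lexicon hit is only a candidate. Whether "critical" in "critical illness" is promotional or simply descriptive depends on context, which is the judgment the annotation guidelines and the trained classifiers are meant to supply.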
