CL DL IR OT · Aug 19, 2025

Scalable Scientific Interest Profiling Using Large Language Models

Yilun Liang, Gongbo Zhang, Edward Sun, Betina Idnay, Yilu Fang, Fangyi Chen, Casey Ta, Yifan Peng, Chunhua Weng

arXiv:2508.15834v12.71 citations · h-index: 15 · Journal of Biomedical Informatics

Originality Synthesis-oriented

AI Analysis

This work addresses the need for scalable and up-to-date researcher profiles for scientists and institutions, but it is incremental as it applies existing LLM methods to a specific domain.

The study addresses outdated scientific interest profiles by developing two large language model-based methods that generate profiles from PubMed abstracts and from MeSH terms, then comparing both with self-written profiles. MeSH-based profiles were preferred over abstract-based ones in 67.86% of comparisons and rated good or excellent in 77.78% of cases. Against self-written profiles, generated profiles showed moderate semantic similarity (BERTScore F1 around 0.55) but low lexical overlap.

Research profiles help surface scientists' expertise but are often outdated. We develop and evaluate two large language model-based methods to generate scientific interest profiles: one summarizing PubMed abstracts and one using Medical Subject Headings (MeSH) terms, and compare them with researchers' self-written profiles. We assembled titles, MeSH terms, and abstracts for 595 faculty at Columbia University Irving Medical Center; self-authored profiles were available for 167. Using GPT-4o-mini, we generated profiles and assessed them with automatic metrics and blinded human review. Lexical overlap with self-written profiles was low (ROUGE-L, BLEU, METEOR), while BERTScore indicated moderate semantic similarity (F1: 0.542 for MeSH-based; 0.555 for abstract-based). Paraphrased references yielded 0.851, highlighting metric sensitivity. TF-IDF Kullback-Leibler divergence (8.56 for MeSH-based; 8.58 for abstract-based) suggested distinct keyword choices. In manual review, 77.78 percent of MeSH-based profiles were rated good or excellent, readability was favored in 93.44 percent of cases, and panelists preferred MeSH-based over abstract-based profiles in 67.86 percent of comparisons. Overall, large language models can generate researcher profiles at scale; MeSH-derived profiles tend to be more readable than abstract-derived ones. Machine-generated and self-written profiles differ conceptually, with human summaries introducing more novel ideas.
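To make the two lexical measures in the abstract concrete, here is a minimal sketch of ROUGE-L F1 (based on the longest common subsequence) and a Kullback-Leibler divergence over TF-IDF distributions. The abstract does not specify the tokenizer, the IDF smoothing, the epsilon smoothing, or the direction of the divergence, so all of those are assumptions here, as are the function names and the two example sentences. It is meant to illustrate why paraphrased profiles can score low on lexical overlap, not to reproduce the paper's numbers.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer that strips basic punctuation (an assumption)."""
    stripped = (t.strip(".,;:()") for t in text.lower().split())
    return [t for t in stripped if t]


def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(candidate, reference):
    """ROUGE-L F1: harmonic mean of LCS-based precision and recall."""
    c, r = tokenize(candidate), tokenize(reference)
    lcs = lcs_length(c, r) if c and r else 0
    if lcs == 0:
        return 0.0
    precision, recall = lcs / len(c), lcs / len(r)
    return 2 * precision * recall / (precision + recall)


def tfidf_distribution(doc_tokens, corpus_sets, vocab, eps=1e-9):
    """TF-IDF weights over vocab, smoothed by eps and normalized to sum to 1."""
    n_docs = len(corpus_sets)
    tf = Counter(doc_tokens)
    weights = []
    for term in vocab:
        df = sum(1 for doc in corpus_sets if term in doc)
        idf = math.log((1 + n_docs) / (1 + df)) + 1  # smoothed IDF (assumed variant)
        weights.append(tf[term] * idf + eps)
    total = sum(weights)
    return [w / total for w in weights]


def kl_divergence(p, q):
    """KL(p || q) for two strictly positive distributions of equal length."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


# Hypothetical profile sentences, invented for illustration only.
self_written = "I study clinical trial recruitment and eligibility criteria."
generated = "Research focuses on patient enrollment for clinical studies."

docs = [tokenize(self_written), tokenize(generated)]
corpus_sets = [set(d) for d in docs]
vocab = sorted(set().union(*corpus_sets))

p = tfidf_distribution(docs[0], corpus_sets, vocab)
q = tfidf_distribution(docs[1], corpus_sets, vocab)

overlap = rouge_l_f1(generated, self_written)
divergence = kl_divergence(p, q)
print(f"ROUGE-L F1: {overlap:.3f}, TF-IDF KL divergence: {divergence:.3f}")
```

The two sentences describe similar work but share only one token, so the lexical overlap is small and the keyword distributions diverge. An embedding-based metric such as BERTScore is designed to capture this kind of semantic closeness that token matching misses.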
