CL · Aug 27, 2025

Predicting Failures of LLMs to Link Biomedical Ontology Terms to Identifiers: Evidence Across Models and Ontologies

Daniel B. Hier, Steven Keith Platt, Tayo Obafemi-Ajayi

arXiv:2509.04458v1 · 6.74 citations · h-index: 14 · 2025 IEEE EMBS International Conference on Biomedical and Health Informatics (BHI)

Originality: Synthesis-oriented

AI Analysis

The paper addresses a specific problem in biomedical NLP that is relevant to researchers and practitioners, but its contribution is incremental: it analyzes existing models and features without introducing new methods.

The study investigated why large language models fail to link biomedical ontology terms to correct identifiers, analyzing predictions across two ontologies and two models, and found that exposure to ontology identifiers was the strongest predictor of linking success.

Large language models often perform well on biomedical NLP tasks but may fail to link ontology terms to their correct identifiers. We investigate why these failures occur by analyzing predictions across two major ontologies, Human Phenotype Ontology and Gene Ontology, and two high-performing models, GPT-4o and LLaMa 3.1 405B. We evaluate nine candidate features related to term familiarity, identifier usage, morphology, and ontology structure. Univariate and multivariate analyses show that exposure to ontology identifiers is the strongest predictor of linking success.
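As a rough illustration of what a univariate check of one candidate feature could look like, the sketch below compares linking success for terms with low versus high identifier exposure. It is not the authors' analysis: the records, the placeholder identifiers, the `id_exposure` field, and the median split are all assumptions made for this example.

```python
from statistics import median

# Hypothetical toy records, invented for illustration (not data from the paper).
# "gold" is the correct identifier, "pred" is what a model returned, and
# "id_exposure" is a made-up count standing in for identifier exposure.
records = [
    {"term": "term_a", "gold": "X:0000001", "pred": "X:0000001", "id_exposure": 1},
    {"term": "term_b", "gold": "X:0000002", "pred": "X:0000009", "id_exposure": 2},
    {"term": "term_c", "gold": "X:0000003", "pred": "X:0000008", "id_exposure": 3},
    {"term": "term_d", "gold": "X:0000004", "pred": "X:0000004", "id_exposure": 10},
    {"term": "term_e", "gold": "X:0000005", "pred": "X:0000005", "id_exposure": 20},
    {"term": "term_f", "gold": "X:0000006", "pred": "X:0000006", "id_exposure": 30},
]


def is_linked(record):
    """A term counts as linked only if the predicted identifier matches exactly."""
    return record["pred"] == record["gold"]


def success_rate(rows):
    """Fraction of rows that were linked correctly (NaN for an empty group)."""
    if not rows:
        return float("nan")
    return sum(is_linked(r) for r in rows) / len(rows)


def univariate_split(rows, feature):
    """Split rows at the median of one feature and return the linking
    success rate of the low group and the high group, in that order."""
    cut = median(r[feature] for r in rows)
    low = [r for r in rows if r[feature] <= cut]
    high = [r for r in rows if r[feature] > cut]
    return success_rate(low), success_rate(high)


low_rate, high_rate = univariate_split(records, "id_exposure")
print(f"low exposure: {low_rate:.2f}, high exposure: {high_rate:.2f}")
```

A multivariate analysis would instead fit all candidate features jointly, for example with logistic regression, so that the contribution of each feature is estimated while the others are held fixed.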
