William Hersh

h-index: 66

7 papers

231 citations

Novelty: 18%

AI Score: 32

Ranked #128,351 of 194,257 authors (top 66%)
#1,329 in IR (top 61%)

7 Papers

16.8 IR Nov 30, 2023

Search Still Matters: Information Retrieval in the Era of Generative AI

William R. Hersh

Objective: Information retrieval (IR, also known as search) systems are ubiquitous in modern times. How does the emergence of generative artificial intelligence (AI), based on large language models (LLMs), fit into the IR process? Process: This perspective explores the use of generative AI in the context of the motivations, considerations, and outcomes of the IR process with a focus on the academic use of such systems. Conclusions: There are many information needs, from simple to complex, that motivate use of IR. Users of such systems, particularly academics, have concerns for authoritativeness, timeliness, and contextualization of search. While LLMs may provide functionality that aids the IR process, the continued need for search systems, and research into their improvement, remains essential.

0.6 CL Dec 28, 2025

Clinical Document Metadata Extraction: A Scoping Review

Kurt Miller, Qiuhao Lu, William Hersh et al.

Clinical document metadata, such as document type, structure, author role, medical specialty, and encounter setting, is essential for accurate interpretation of information captured in clinical documents. However, vast documentation heterogeneity and drift over time challenge harmonization of document metadata. Automated extraction methods have emerged to coalesce metadata from disparate practices into target schema. This scoping review aims to catalog research on clinical document metadata extraction, identify methodological trends and applications, and highlight gaps. We followed the PRISMA-ScR (Preferred Reporting Items for Systematic Reviews and Meta-Analyses Extension for Scoping Reviews) guidelines to identify articles that perform clinical document metadata extraction. We initially found and screened 266 articles published between January 2011 and August 2025, then comprehensively reviewed 67 we deemed relevant to our study. Among the articles included, 45 were methodological, 17 used document metadata as features in a downstream application, and 5 analyzed document metadata composition. We observe myriad purposes for methodological study and application types. Available labelled public data remains sparse except for structural section datasets. Methods for extracting document metadata have progressed from largely rule-based and traditional machine learning with ample feature engineering to transformer-based architectures with minimal feature engineering. The emergence of large language models has enabled broader exploration of generalizability across tasks and datasets, allowing the possibility of advanced clinical text processing systems. We anticipate that research will continue to expand into richer document metadata representations and integrate further into clinical applications and workflows.

3.3 AI Jan 17, 2025

Generative Artificial Intelligence: Implications for Biomedical and Health Professions Education

William Hersh

Generative AI has had a profound impact on biomedicine and health, both in professional work and in education. Based on large language models (LLMs), generative AI has been found to perform as well as humans in simulated situations taking medical board exams, answering clinical questions, solving clinical cases, applying clinical reasoning, and summarizing information. Generative AI is also being used widely in education, performing well in academic courses and their assessments. This review summarizes the successes of LLMs and highlights some of their challenges in the context of education, most notably aspects that may undermine the acquisition of knowledge and skills for professional work. It then provides recommendations for best practices to overcome shortcomings of LLM use in education. Although there are challenges for use of generative AI in education, all students and faculty, in biomedicine and health and beyond, must understand it and be competent in its use.

1.2 CY May 20, 2025

Bridge2AI: Building A Cross-disciplinary Curriculum Towards AI-Enhanced Biomedical and Clinical Care

John Rincon, Alexander R. Pelletier, Destiny Gilliland et al.

Objective: As AI becomes increasingly central to healthcare, there is a pressing need for bioinformatics and biomedical training systems that are personalized and adaptable. Materials and Methods: The NIH Bridge2AI Training, Recruitment, and Mentoring (TRM) Working Group developed a cross-disciplinary curriculum grounded in collaborative innovation, ethical data stewardship, and professional development within an adapted Learning Health System (LHS) framework. Results: The curriculum integrates foundational AI modules, real-world projects, and a structured mentee-mentor network spanning Bridge2AI Grand Challenges and the Bridge Center. Guided by six learner personas, the program tailors educational pathways to individual needs while supporting scalability. Discussion: Iterative refinement driven by continuous feedback ensures that content remains responsive to learner progress and emerging trends. Conclusion: With over 30 scholars and 100 mentors engaged across North America, the TRM model demonstrates how adaptive, persona-informed training can build interdisciplinary competencies and foster an integrative, ethically grounded AI education in biomedical contexts.

11.2 IR Apr 19, 2021

Searching for Scientific Evidence in a Pandemic: An Overview of TREC-COVID

Kirk Roberts, Tasmeer Alam, Steven Bedrick et al.

We present an overview of the TREC-COVID Challenge, an information retrieval (IR) shared task to evaluate search on scientific literature related to COVID-19. The goals of TREC-COVID include the construction of a pandemic search test collection and the evaluation of IR methods for COVID-19. The challenge was conducted over five rounds from April to July, 2020, with participation from 92 unique teams and 556 individual submissions. A total of 50 topics (sets of related queries) were used in the evaluation, starting at 30 topics for Round 1 and adding 5 new topics per round to target emerging topics at that state of the still-emerging pandemic. This paper provides a comprehensive overview of the structure and results of TREC-COVID. Specifically, the paper provides details on the background, task structure, topic structure, corpus, participation, pooling, assessment, judgments, results, top-performing systems, lessons learned, and benchmark datasets.

32.5 IR May 9, 2020

TREC-COVID: Constructing a Pandemic Information Retrieval Test Collection

Ellen Voorhees, Tasmeer Alam, Steven Bedrick et al.

TREC-COVID is a community evaluation designed to build a test collection that captures the information needs of biomedical researchers using the scientific literature during a pandemic. One of the key characteristics of pandemic search is the accelerated rate of change: the topics of interest evolve as the pandemic progresses and the scientific literature in the area explodes. The COVID-19 pandemic provides an opportunity to capture this progression as it happens. TREC-COVID, in creating a test collection around COVID-19 literature, is building infrastructure to support new research and technologies in pandemic search.

6.6 IR Jan 22, 2019 Code

CREATE: Cohort Retrieval Enhanced by Analysis of Text from Electronic Health Records using OMOP Common Data Model

Sijia Liu, Yanshan Wang, Andrew Wen et al.

Background: Widespread adoption of electronic health records (EHRs) has enabled secondary use of EHR data for clinical research and healthcare delivery. Natural language processing (NLP) techniques have shown promise in their capability to extract the embedded information in unstructured clinical data, and information retrieval (IR) techniques provide flexible and scalable solutions that can augment the NLP systems for retrieving and ranking relevant records. Methods: In this paper, we present the implementation of Cohort Retrieval Enhanced by Analysis of Text from EHRs (CREATE), a cohort retrieval system that can execute textual cohort selection queries on both structured and unstructured EHR data. CREATE is a proof-of-concept system that leverages a combination of structured queries and IR techniques on NLP results to improve cohort retrieval performance while adopting the Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM) to enhance model portability. The NLP component empowered by cTAKES is used to extract CDM concepts from textual queries. We design a hierarchical index in Elasticsearch to support CDM concept search utilizing IR techniques and frameworks. Results: Our case study on 5 cohort identification queries evaluated using the IR metric, P@5 (Precision at 5) at both the patient-level and document-level, demonstrates that CREATE achieves an average P@5 of 0.90, which outperforms systems using only structured data or only unstructured data with average P@5s of 0.54 and 0.74, respectively.
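The P@5 metric named in the abstract above can be illustrated with a minimal sketch. This is not the CREATE authors' evaluation code: the function names `precision_at_k` and `mean_precision_at_k`, the identifiers, and the toy relevance judgments are all hypothetical, chosen only to show how precision at a cutoff is usually computed and then averaged over queries.

```python
from typing import Iterable, Sequence, Set, Tuple


def precision_at_k(ranked_ids: Sequence[str], relevant_ids: Set[str], k: int = 5) -> float:
    """Return the fraction of the top-k ranked items that are judged relevant.

    The denominator is always k, so a ranking shorter than k is penalized
    (a common convention, assumed here).
    """
    if k <= 0:
        raise ValueError("k must be positive")
    top_k = ranked_ids[:k]
    hits = sum(1 for item in top_k if item in relevant_ids)
    return hits / k


def mean_precision_at_k(
    runs: Iterable[Tuple[Sequence[str], Set[str]]], k: int = 5
) -> float:
    """Average P@k over several (ranking, relevant set) pairs, one per query."""
    scores = [precision_at_k(ranked, relevant, k) for ranked, relevant in runs]
    if not scores:
        raise ValueError("at least one query is required")
    return sum(scores) / len(scores)


# Hypothetical toy data: two queries with made-up patient identifiers.
query_1 = (["p1", "p2", "p3", "p4", "p5", "p6"], {"p1", "p3", "p5", "p6"})
query_2 = (["p7", "p8", "p9", "p10", "p11"], {"p7", "p8", "p9", "p10", "p11"})

p5_query_1 = precision_at_k(*query_1)                # 3 of the top 5 are relevant
average_p5 = mean_precision_at_k([query_1, query_2])  # mean over both queries
```

In this toy example, the first query scores 3/5 = 0.6 because `p6` is relevant but falls outside the top five. The same function applies at the patient level or the document level, depending on which identifiers are ranked.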