CL, Jun 24, 2025

heiDS at ArchEHR-QA 2025: From Fixed-k to Query-dependent-k for Retrieval Augmented Generation

arXiv:2506.19512v1 | 1 citation | h-index: 4 | Proceedings of the 24th Workshop on Biomedical Language Processing (Shared Tasks)
Originality: Synthesis-oriented
AI Analysis

This work aims to improve answer quality in clinical question-answering systems. It appears incremental: it builds on existing RAG frameworks and adds new ranked list truncation methods.

The paper tackles the problem of generating accurate answers from electronic health records by proposing a query-dependent-k retrieval strategy for retrieval augmented generation, and it reports benefits over fixed-k retrieval in producing factual and relevant answers.

This paper presents the approach of our team, heiDS, for the ArchEHR-QA 2025 shared task. A pipeline using a retrieval augmented generation (RAG) framework is designed to generate answers that are attributed to clinical evidence from the electronic health records (EHRs) of patients in response to patient-specific questions. We explored various components of a RAG framework, focusing on ranked list truncation (RLT) retrieval strategies and attribution approaches. Instead of using a fixed top-k RLT retrieval strategy, we employ a query-dependent-k retrieval strategy, including the existing surprise and autocut methods and two new methods proposed in this work, autocut* and elbow. The experimental results show the benefits of our strategy in producing factual and relevant answers when compared to a fixed-k strategy.
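To make the contrast between fixed-k and query-dependent-k truncation concrete, here is a minimal sketch. It is not the paper's surprise, autocut, autocut*, or elbow method, whose details are not given in the abstract. It uses a simple largest-gap heuristic as a stand-in, and the function names `fixed_k` and `largest_gap_k` are hypothetical.

```python
from typing import List


def fixed_k(scores: List[float], k: int) -> int:
    """Fixed top-k: always keep k results (or fewer if the list is shorter)."""
    return min(k, len(scores))


def largest_gap_k(scores: List[float]) -> int:
    """Query-dependent k (illustrative heuristic, NOT the paper's method).

    Sort the retrieval scores in descending order and cut the ranked list
    at the largest drop between two consecutive scores.
    Returns the number of top-ranked results to keep.
    """
    ranked = sorted(scores, reverse=True)
    if len(ranked) <= 1:
        return len(ranked)
    # gaps[i] is the score drop between rank i and rank i + 1 (0-based).
    gaps = [ranked[i] - ranked[i + 1] for i in range(len(ranked) - 1)]
    cut = max(range(len(gaps)), key=gaps.__getitem__)
    return cut + 1


# Hypothetical similarity scores for two different queries.
query_a = [0.90, 0.88, 0.85, 0.30, 0.28, 0.25]  # three clearly relevant hits
query_b = [0.95, 0.40, 0.38, 0.35]              # one clearly relevant hit

k_a = largest_gap_k(query_a)
k_b = largest_gap_k(query_b)
```

With a fixed k, both queries would receive the same number of passages. The heuristic above instead keeps three passages for the first query and one for the second, which is the kind of per-query adaptation that a query-dependent-k strategy aims for.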

