heiDS at ArchEHR-QA 2025: From Fixed-k to Query-dependent-k for Retrieval Augmented Generation
This work addresses the challenge of improving answer quality in clinical question-answering systems, though it appears incremental as it builds on existing RAG frameworks with new truncation methods.
The paper tackles the problem of generating accurate answers from electronic health records by proposing a query-dependent-k retrieval strategy for retrieval augmented generation, and reports benefits over a fixed-k strategy in producing factual and relevant answers.
This paper presents the approach of our team, heiDS, for the ArchEHR-QA 2025 shared task. We design a pipeline based on a retrieval augmented generation (RAG) framework to generate answers to patient-specific questions, with each answer attributed to clinical evidence from the patient's electronic health record (EHR). We explore various components of a RAG framework, focusing on ranked list truncation (RLT) retrieval strategies and attribution approaches. Instead of using a fixed top-k RLT retrieval strategy, we employ a query-dependent-k retrieval strategy, including the existing surprise and autocut methods and two new methods proposed in this work, autocut* and elbow. The experimental results show the benefits of our strategy in producing factual and relevant answers when compared to a fixed-k strategy.
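To illustrate the difference between a fixed-k and a query-dependent-k cutoff, the following minimal sketch truncates a ranked list of retrieval scores in two ways. It is not the autocut, autocut*, surprise, or elbow method as defined in the paper, whose exact formulations are not given in this abstract. Instead it assumes a generic knee heuristic: keep everything up to the point on the score curve that lies farthest from the straight line joining the first and last scores. The function names `fixed_k` and `chord_elbow_k` and the example scores are hypothetical.

```python
from typing import Sequence


def fixed_k(scores: Sequence[float], k: int) -> int:
    """Fixed top-k truncation: the same cutoff for every query."""
    return min(k, len(scores))


def chord_elbow_k(scores: Sequence[float]) -> int:
    """Query-dependent cutoff using a generic knee heuristic (an assumption,
    not necessarily the paper's elbow method).

    `scores` must be sorted in descending order. The cutoff is placed at the
    rank whose score lies farthest from the chord joining the first and last
    points of the score curve.
    """
    n = len(scores)
    if n <= 2:
        return n  # too few points to define a knee; keep everything
    first, last = scores[0], scores[-1]
    dx = n - 1          # horizontal extent of the chord (in ranks)
    dy = last - first   # vertical extent of the chord (in score units)
    best_index, best_dist = 0, -1.0
    for i, s in enumerate(scores):
        # Unnormalised perpendicular distance from (i, s) to the chord;
        # the normalising constant is the same for every point.
        dist = abs(dy * i - dx * (s - first))
        if dist > best_dist:
            best_index, best_dist = i, dist
    return best_index + 1  # number of items to keep


# Two hypothetical queries with differently shaped score curves.
query_a = [0.95, 0.60, 0.30, 0.25, 0.22, 0.20]
query_b = [0.90, 0.88, 0.86, 0.85, 0.40, 0.38]

print(fixed_k(query_a, 5), fixed_k(query_b, 5))
print(chord_elbow_k(query_a), chord_elbow_k(query_b))
```

With a fixed k, both queries receive the same number of retrieved items regardless of how their scores are distributed, whereas the knee heuristic adapts the cutoff to the shape of each query's score curve.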