CL | Jul 30, 2023

A Knowledge-enhanced Two-stage Generative Framework for Medical Dialogue Information Extraction

Zefa Hu, Ziyi Ni, Jing Shi, Shuang Xu, Bo Xu

arXiv:2307.16200v42.15 citations | h-index: 27 | Has Code

Originality: Incremental advance

AI Analysis

This work addresses a domain-specific problem for medical diagnosis systems and EMR automation, offering an incremental improvement over existing generative methods.

The paper tackles term-status pair extraction from medical dialogues by proposing a knowledge-enhanced two-stage generative framework, achieving superior results on Chunyu and CMDD datasets compared to state-of-the-art models in full and low-resource settings.

This paper focuses on term-status pair extraction from medical dialogues (MD-TSPE), which is essential in diagnosis dialogue systems and the automatic scribing of electronic medical records (EMRs). In the past few years, MD-TSPE has attracted increasing research attention, especially after the remarkable progress made by generative methods. However, these generative methods output a whole sequence of term-status pairs in a single stage and do not integrate prior knowledge, so the model needs a deeper understanding to capture the relationships between terms and infer the status of each term. This paper presents a knowledge-enhanced two-stage generative framework (KTGF) to address these challenges. Using task-specific prompts, we employ a single model to complete MD-TSPE in two phases in a unified generative form: we first generate all terms and then generate the status of each generated term. In this way, the relationships between terms can be learned more effectively from the sequence containing only terms in the first phase, and our knowledge-enhanced prompt in the second phase can leverage the category and status candidates of the generated term for status generation. Furthermore, our proposed special status "not mentioned" makes more terms available and enriches the training data in the second phase, which is critical in the low-resource setting. Experiments on the Chunyu and CMDD datasets show that the proposed method achieves superior results compared to state-of-the-art models in both the full-training and low-resource settings.
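The two-phase flow described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the prompt templates, the `KNOWLEDGE` table, the status labels other than "not mentioned", the `;` term separator, and the choice to drop "not mentioned" terms at inference are all assumptions made for the example. The `generate` argument stands in for any single sequence-to-sequence model that is called in both phases.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical status inventory; only "not mentioned" is named in the abstract.
STATUSES: List[str] = ["positive", "negative", "unknown", "not mentioned"]

# Hypothetical prior knowledge: term -> (category, status candidates).
KNOWLEDGE: Dict[str, Tuple[str, List[str]]] = {
    "cough": ("symptom", STATUSES),
    "fever": ("symptom", STATUSES),
    "gastritis": ("disease", STATUSES),
}


def term_prompt(dialogue: str) -> str:
    """Phase 1 prompt: ask the model for all terms in the dialogue."""
    return f"extract terms | dialogue: {dialogue}"


def status_prompt(dialogue: str, term: str) -> str:
    """Phase 2 prompt: inject the term's category and status candidates."""
    category, candidates = KNOWLEDGE.get(term, ("unknown category", STATUSES))
    return (
        f"generate status | term: {term} | category: {category} | "
        f"candidates: {', '.join(candidates)} | dialogue: {dialogue}"
    )


def extract_pairs(
    dialogue: str, generate: Callable[[str], str]
) -> List[Tuple[str, str]]:
    """Run both phases with the same model and return (term, status) pairs."""
    # Phase 1: generate a sequence that contains only terms.
    raw_terms = generate(term_prompt(dialogue))
    terms = [t.strip() for t in raw_terms.split(";") if t.strip()]

    # Phase 2: generate one status per generated term.
    pairs: List[Tuple[str, str]] = []
    for term in terms:
        status = generate(status_prompt(dialogue, term)).strip()
        # Assumption: terms judged "not mentioned" are dropped from the output.
        if status != "not mentioned":
            pairs.append((term, status))
    return pairs


def toy_generate(prompt: str) -> str:
    """Stand-in for a trained model, with canned answers for the demo."""
    if prompt.startswith("extract terms"):
        return "cough; fever; gastritis"
    if "term: cough |" in prompt:
        return "positive"
    if "term: fever |" in prompt:
        return "negative"
    return "not mentioned"


if __name__ == "__main__":
    dialogue = "Patient: I have been coughing but have no fever."
    print(extract_pairs(dialogue, toy_generate))
```

In this sketch the "not mentioned" status also shows how the second phase can absorb an over-generated term from the first phase (here, "gastritis") instead of forcing it into a real status.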
