CL | Oct 18, 2024

MediTOD: An English Dialogue Dataset for Medical History Taking with Comprehensive Annotations

IBM
arXiv:2410.14204v1 | 34 citations | h-index: 12 | EMNLP
Originality: Synthesis-oriented
AI Analysis

This work addresses a data scarcity problem for researchers and developers building medical task-oriented dialogue systems. Its contribution is incremental, since it focuses on dataset creation rather than novel methods.

The authors address the lack of publicly available, comprehensively annotated English dialogue datasets for medical history taking by introducing MediTOD, a new dataset with high-quality annotations created by medical professionals. They also establish benchmarks showing competitive performance in supervised and few-shot settings.

Medical task-oriented dialogue systems can assist doctors by collecting patient medical history, aiding in diagnosis, or guiding treatment selection, thereby reducing doctor burnout and expanding access to medical services. However, doctor-patient dialogue datasets are not readily available, primarily due to privacy regulations. Moreover, existing datasets lack comprehensive annotations involving medical slots and their different attributes, such as symptoms and their onset, progression, and severity. These comprehensive annotations are crucial for accurate diagnosis. Finally, most existing datasets are non-English, limiting their utility for the larger research community. In response, we introduce MediTOD, a new dataset of doctor-patient dialogues in English for the medical history-taking task. Collaborating with doctors, we devise a questionnaire-based labeling scheme tailored to the medical domain. Then, medical professionals create the dataset with high-quality comprehensive annotations, capturing medical slots and their attributes. We establish benchmarks in supervised and few-shot settings on MediTOD for natural language understanding, policy learning, and natural language generation subtasks, evaluating models from both TOD and biomedical domains. We make MediTOD publicly available for future research.
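To make the idea of "medical slots and their attributes" concrete, here is a minimal Python sketch of how one annotated patient turn might be represented. The class names, field names, intent label, and the `missing_attributes` helper are all hypothetical and chosen for illustration; they are not MediTOD's actual schema or file format. Only the attribute names (onset, progression, severity) come from the abstract.

```python
from dataclasses import dataclass, field


@dataclass
class SlotAnnotation:
    """One medical slot plus its attributes (hypothetical schema, not MediTOD's)."""
    slot: str                     # slot type, e.g. "symptom"
    value: str                    # slot value, e.g. "cough"
    attributes: dict[str, str] = field(default_factory=dict)


@dataclass
class AnnotatedUtterance:
    """A single dialogue turn with its annotations (hypothetical schema)."""
    speaker: str                  # "doctor" or "patient"
    text: str
    intent: str                   # illustrative label, e.g. "inform"
    slots: list[SlotAnnotation] = field(default_factory=list)


def missing_attributes(utt, required=("onset", "progression", "severity")):
    """For each symptom slot, list the required attributes not yet annotated."""
    return {
        s.value: [a for a in required if a not in s.attributes]
        for s in utt.slots
        if s.slot == "symptom"
    }


utt = AnnotatedUtterance(
    speaker="patient",
    text="I've had a bad cough for about three days.",
    intent="inform",
    slots=[
        SlotAnnotation(
            "symptom",
            "cough",
            {"onset": "three days ago", "severity": "bad"},
        )
    ],
)

print(missing_attributes(utt))  # {'cough': ['progression']}
```

In a history-taking setting, a structure like this would let a dialogue policy see which attributes of a reported symptom are still unknown and ask about them next, which is one reason attribute-level annotations matter beyond bare symptom labels.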

