Iterative Prompt Refinement for Radiation Oncology Symptom Extraction Using Teacher-Student Large Language Models
The paper addresses symptom extraction for radiation oncology and offers a novel method, but the contribution is incremental in that it applies existing LLMs to a specific medical domain.
This study extracts prostate cancer radiotherapy symptoms from clinical notes using a teacher-student LLM architecture with iterative prompt refinement. F1 scores rose from 0.49 to 0.73 on single-symptom notes and from 0.20 to 0.44 on multi-symptom notes.
This study introduces a novel teacher-student architecture utilizing Large Language Models (LLMs) to improve prostate cancer radiotherapy symptom extraction from clinical notes. Mixtral, the student model, first extracts symptoms; GPT-4, the teacher model, then refines the prompts based on Mixtral's performance. This iterative process involved 294 single-symptom clinical notes across 12 symptoms, with up to 16 rounds of refinement per epoch. Results showed significant improvements in extracting symptoms from both single-symptom and multi-symptom notes. For 59 single-symptom notes, accuracy increased from 0.51 to 0.71, precision from 0.52 to 0.82, recall from 0.52 to 0.72, and F1 score from 0.49 to 0.73. In 375 multi-symptom notes, accuracy rose from 0.24 to 0.43, precision from 0.6 to 0.76, recall from 0.24 to 0.43, and F1 score from 0.20 to 0.44. These results demonstrate the effectiveness of advanced prompt engineering in LLMs for radiation oncology use.
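The refinement loop described in the abstract might be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: `refine_prompt`, `toy_student`, and `toy_teacher` are hypothetical names, the two toy functions are plain-Python stand-ins for Mixtral and GPT-4, and scoring by accuracy, keeping the best-scoring prompt, and stopping early on a perfect score are all assumptions. Only the cap of 16 rounds comes from the text.

```python
from typing import Callable, List, Tuple

Note = Tuple[str, str]        # (note text, gold symptom label)
Error = Tuple[str, str, str]  # (note text, gold label, student prediction)


def refine_prompt(
    student: Callable[[str, str], str],
    teacher: Callable[[str, List[Error]], str],
    notes: List[Note],
    prompt: str,
    max_rounds: int = 16,  # the abstract reports up to 16 rounds per epoch
) -> Tuple[str, float]:
    """Return the best-scoring prompt and its accuracy (assumed selection rule)."""
    if not notes:
        raise ValueError("at least one labelled note is required")
    best_prompt, best_score = prompt, -1.0
    for _ in range(max_rounds):
        # Student pass: extract one symptom per note with the current prompt.
        preds = [student(prompt, text) for text, _ in notes]
        score = sum(p == gold for p, (_, gold) in zip(preds, notes)) / len(notes)
        if score > best_score:
            best_prompt, best_score = prompt, score
        if score == 1.0:  # assumed early stop; not stated in the source
            break
        # Teacher pass: revise the prompt using the student's mistakes.
        errors = [(t, g, p) for (t, g), p in zip(notes, preds) if p != g]
        prompt = teacher(prompt, errors)
    return best_prompt, best_score


def toy_student(prompt: str, note: str) -> str:
    # Stand-in for the student LLM: applies "cue => label" rules found in the prompt.
    for line in prompt.splitlines():
        if "=>" in line:
            cue, label = line.split("=>", 1)
            if cue.strip() in note.lower():
                return label.strip()
    return "none"


def toy_teacher(prompt: str, errors: List[Error]) -> str:
    # Stand-in for the teacher LLM: adds one rule covering the first missed note.
    text, gold, _ = errors[0]
    cue = text.lower().split()[-1]
    return prompt + f"\n{cue} => {gold}"


toy_notes = [("patient feels tired", "fatigue"), ("reports nocturia", "nocturia")]
final_prompt, final_score = refine_prompt(
    toy_student, toy_teacher, toy_notes, "Extract the symptom."
)
```

In a real setting the two callables would wrap calls to the student and teacher models, and the score would more likely be computed on held-out notes than on the notes used for refinement.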