Clinical Reading Comprehension with Encoder-Decoder Models Enhanced by Direct Preference Optimization
This work addresses the need to process the clinical text generated in hospitals, but its contribution is incremental: it applies the existing DPO method to a new domain, using novel heuristics to generate preference data.
The paper tackles the problem of extractive question answering over clinical text by combining encoder-decoder models with direct preference optimization (DPO), resulting in a 12-15 F1 point improvement over prior state of the art on the RadQA radiology question answering task.
Extractive question answering over clinical text is crucial for coping with the deluge of clinical text generated in hospitals. While encoder models (e.g., BERT) have been popular for this reading comprehension task, encoder-decoder models (e.g., T5) have recently been on the rise. Preference optimization techniques have also emerged for aligning decoder-only LLMs with human preferences. In this paper, we combine encoder-decoder models with the direct preference optimization (DPO) method to improve over the prior state of the art on the RadQA radiology question answering task by 12-15 F1 points. To the best of our knowledge, this effort is the first to show that the DPO method also works for reading comprehension, using novel heuristics to generate preference data without human input.
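For readers unfamiliar with DPO, the following is a minimal sketch of the standard DPO objective for a single preference pair, written in plain Python. It is not taken from this paper: the function name `dpo_loss`, its argument names, and the default `beta=0.1` are illustrative assumptions, and the heuristics used here to build the preferred and rejected answers are not reproduced. The inputs are assumed to be sequence-level log-probabilities of a preferred ("chosen") and a dispreferred ("rejected") answer under the model being trained and under a frozen reference model.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one (chosen, rejected) answer pair.

    All names and the default beta are illustrative assumptions,
    not values reported in the paper.
    """
    # Log-ratio of policy to reference for each answer.
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp

    # Scaled margin by which the policy prefers the chosen answer
    # more strongly than the reference model does.
    margin = beta * (chosen_logratio - rejected_logratio)

    # Loss is -log(sigmoid(margin)), computed in a numerically stable way.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))
```

When the policy and the reference model assign identical log-probabilities, the margin is zero and the loss equals log 2; the loss falls as the policy shifts probability toward the chosen answer relative to the reference.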