CL · Aug 15, 2024

Evaluating Fine-Tuning Efficiency of Human-Inspired Learning Strategies in Medical Question Answering

arXiv:2408.07888v2 · 3 citations · h-index: 4
Originality: Synthesis-oriented
AI Analysis

This work addresses the problem of high training costs in fine-tuning LLMs for medical QA, offering incremental improvements in data efficiency.

The study evaluated five human-inspired data ordering strategies for fine-tuning LLMs on medical question answering. The strategies yielded a best accuracy gain of 1.81% and an average gain of 1.02% across datasets, but the best strategy varied across model-dataset combinations, which limits how far any single strategy generalises.

Fine-tuning Large Language Models (LLMs) incurs considerable training costs, driving the need for data-efficient training with optimised data ordering. Human-inspired strategies offer a solution by organising data based on human learning practices. This study evaluates the fine-tuning efficiency of five human-inspired strategies across four language models, three datasets, and both human- and LLM-labelled data in the context of medical question answering. These strategies achieve the best accuracy gain of 1.81% and an average gain of 1.02% across datasets, with interleaved strategies delivering the best average results. However, the best strategy varies across model-dataset combinations, limiting the generalisability of the effects of any single strategy. Additionally, LLM-defined question difficulty outperforms human-defined labels in curriculum-based learning, showing the potential of model-generated data as a cost-effective alternative for optimising fine-tuning.
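To make the idea of data ordering concrete, the following is a minimal sketch of two orderings the abstract mentions by name: a curriculum ordering (easy to hard by a difficulty score) and an interleaved ordering (cycling across question categories). It is not the authors' implementation. The field names `category` and `difficulty`, the helper functions, and the sample questions are all hypothetical, and the difficulty score could come from either human or LLM labels.

```python
from collections import defaultdict
from itertools import zip_longest


def curriculum_order(questions):
    """Return questions sorted from easiest to hardest.

    Assumes each question carries a numeric 'difficulty' field
    (hypothetical; it could be human- or LLM-defined).
    """
    return sorted(questions, key=lambda q: q["difficulty"])


def interleaved_order(questions):
    """Return questions in round-robin order across categories.

    Assumes each question carries a 'category' field (hypothetical).
    Order within each category is preserved.
    """
    by_category = defaultdict(list)
    for q in questions:
        by_category[q["category"]].append(q)

    ordered = []
    # Take one question from each category per round; shorter
    # categories simply drop out once they are exhausted.
    for round_ in zip_longest(*by_category.values()):
        ordered.extend(q for q in round_ if q is not None)
    return ordered


# Illustrative data only; not drawn from the paper's datasets.
questions = [
    {"id": 1, "category": "cardiology", "difficulty": 0.7},
    {"id": 2, "category": "cardiology", "difficulty": 0.2},
    {"id": 3, "category": "neurology", "difficulty": 0.5},
    {"id": 4, "category": "neurology", "difficulty": 0.9},
    {"id": 5, "category": "oncology", "difficulty": 0.1},
]

curriculum_ids = [q["id"] for q in curriculum_order(questions)]
interleaved_ids = [q["id"] for q in interleaved_order(questions)]
print(curriculum_ids)
print(interleaved_ids)
```

Either ordered list would then be fed to the fine-tuning loop without shuffling, so that the order in which the model sees the examples is the only thing that changes between runs.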

Code Implementations: 2 repos