Complexity-aware fine-tuning
This work addresses the challenge of reducing data and computational costs for fine-tuning LLMs in specific domains, representing an incremental improvement over existing methods.
The paper proposes a complexity-aware fine-tuning method for large language models that splits the training data by answer entropy. It reaches an average accuracy of 0.58, compared with 0.45 for standard SFT and 0.56 for distillation, while using 81% less data.
General-purpose Large Language Models (LLMs) are frequently adapted to specific domains through supervised fine-tuning (SFT). Better results can be achieved by distilling the chain-of-thought of a larger model, at the cost of numerous expensive calls and a much greater amount of data. We propose a novel blueprint for efficient fine-tuning that uses reasoning only for complex data identified by entropy. Specifically, across two small open models ($\sim 3$B), we split the training data into complexity categories by single-token answer entropy (ROC AUC $0.73$), fine-tune the models via SFT and distillation, and show that our pipeline significantly outperforms the standard SFT approach ($0.58$ vs $0.45$ average accuracy) and outperforms the distillation approach ($0.58$ vs $0.56$ average accuracy) while using $81\%$ less data.
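The entropy-based split can be illustrated with a minimal sketch. It assumes each training example comes with the model's probabilities over a small set of single-token answers (for instance, multiple-choice letters). The function names, the three bucket labels, and the thresholds `low` and `high` are hypothetical choices made for illustration and are not taken from the paper. Only the routing of high-entropy examples to chain-of-thought distillation reflects the idea stated above.

```python
import math


def answer_entropy(probs):
    """Shannon entropy (in nats) of a distribution over single-token answers.

    `probs` holds non-negative weights for each candidate answer token;
    they are renormalised so they do not need to sum to exactly 1.
    """
    total = sum(probs)
    if total <= 0:
        raise ValueError("probabilities must have a positive sum")
    entropy = 0.0
    for p in probs:
        q = p / total
        if q > 0:  # treat 0 * log(0) as 0
            entropy -= q * math.log(q)
    return entropy


def complexity_bucket(probs, low=0.5, high=1.0):
    """Map answer entropy to a complexity category.

    The thresholds `low` and `high` are hypothetical placeholders,
    not values reported in the paper.
    """
    h = answer_entropy(probs)
    if h < low:
        return "easy"
    if h < high:
        return "medium"
    return "hard"


def route(examples, low=0.5, high=1.0):
    """Split examples into an SFT set and a distillation set.

    Assumption: only the "hard" bucket is sent to chain-of-thought
    distillation; everything else gets plain SFT on the answer.
    Each example is a dict with a "probs" entry.
    """
    sft, distill = [], []
    for ex in examples:
        bucket = complexity_bucket(ex["probs"], low, high)
        (distill if bucket == "hard" else sft).append(ex)
    return sft, distill


if __name__ == "__main__":
    data = [
        {"id": 1, "probs": [0.97, 0.01, 0.01, 0.01]},  # confident model
        {"id": 2, "probs": [0.25, 0.25, 0.25, 0.25]},  # maximally uncertain
    ]
    sft_set, distill_set = route(data)
    print([ex["id"] for ex in sft_set], [ex["id"] for ex in distill_set])
```

In practice the thresholds would have to be tuned on held-out data. The reported ROC AUC of $0.73$ suggests that entropy is a useful but imperfect signal of complexity.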