CL | AI | LG | Jan 30, 2025

R.I.P.: Better Models by Survival of the Fittest Prompts

arXiv:2501.18578v2 | 11 citations | h-index: 21 | ICML
Originality: Incremental advance
AI Analysis

The paper addresses training data quality as a factor in model training and represents an incremental advance in data filtering techniques.

It introduces a method that filters out low-quality prompts based on response variance and reward gaps. With Llama 3.1-8B-Instruct, this improves AlpacaEval2 LC Win Rate by 9.4%. With Llama 3.3-70B-Instruct, it moves the model from 18th to 6th place on the Arena-Hard leaderboard.

Training data quality is one of the most important drivers of final model quality. In this work, we introduce a method for evaluating data integrity based on the assumption that low-quality input prompts result in high-variance and low-quality responses. This is achieved by measuring the rejected response quality and the reward gap between the chosen and rejected responses of a preference pair. Our method, Rejecting Instruction Preferences (RIP), can be used to filter prompts from existing training sets or to build high-quality synthetic datasets, yielding large performance gains across various benchmarks compared to unfiltered data. Using Llama 3.1-8B-Instruct, RIP improves AlpacaEval2 LC Win Rate by 9.4%, Arena-Hard by 8.7%, and WildBench by 9.9%. Using Llama 3.3-70B-Instruct, RIP improves Arena-Hard from 67.5 to 82.9, which moves it from 18th to 6th place overall on the leaderboard.
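
The filtering idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and the abstract does not specify the exact criteria. The sketch assumes each prompt comes with reward-model scores for several sampled responses, that the chosen and rejected responses are the highest- and lowest-scored ones, and that a prompt is kept when the rejected reward is high enough and the reward gap is small enough. The names `PromptSample`, `rip_style_filter`, `min_rejected_reward`, and `max_reward_gap` are hypothetical, as are the threshold values.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PromptSample:
    """A prompt with reward scores for its sampled responses (hypothetical structure)."""
    prompt: str
    response_rewards: List[float]


def rip_style_filter(
    samples: List[PromptSample],
    min_rejected_reward: float,  # hypothetical threshold on rejected-response quality
    max_reward_gap: float,       # hypothetical threshold on chosen-minus-rejected gap
) -> List[PromptSample]:
    """Keep prompts whose worst response is still decent and whose reward gap is small."""
    kept = []
    for sample in samples:
        # A preference pair needs at least two scored responses.
        if len(sample.response_rewards) < 2:
            continue
        # Assumption: chosen = best-scored response, rejected = worst-scored response.
        chosen_reward = max(sample.response_rewards)
        rejected_reward = min(sample.response_rewards)
        reward_gap = chosen_reward - rejected_reward
        if rejected_reward >= min_rejected_reward and reward_gap <= max_reward_gap:
            kept.append(sample)
    return kept


# Toy data: the scores are made up for illustration only.
samples = [
    PromptSample("Explain binary search.", [8.0, 7.0, 6.0]),
    PromptSample("asdf ???", [9.0, 1.0]),
    PromptSample("Single response only.", [5.0]),
]
filtered = rip_style_filter(samples, min_rejected_reward=5.0, max_reward_gap=3.0)
kept_prompts = [s.prompt for s in filtered]
```

In this toy run, only the first prompt passes both checks. The second is dropped because its rejected response scores low and its reward gap is large, which is the pattern the abstract associates with low-quality prompts.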

