CL AI LG | Jan 30, 2025

R.I.P.: Better Models by Survival of the Fittest Prompts

Ping Yu, Weizhe Yuan, Olga Golovneva, Tianhao Wu, Sainbayar Sukhbaatar, Jason Weston, Jing Xu

arXiv:2501.18578v2 | 17.611 citations | h-index: 21 | ICML

Originality: Incremental advance

AI Analysis

This paper addresses the quality of training data for AI models and represents an incremental advance in data filtering techniques.

The paper tackles the problem of training data quality by introducing a method that filters out low-quality prompts based on the quality of the rejected response and the reward gap between the chosen and rejected responses. Filtering yields large benchmark gains: a 9.4% improvement in AlpacaEval2 LC Win Rate with Llama 3.1-8B-Instruct, and a move from 18th to 6th place on Arena-Hard with Llama 3.3-70B-Instruct.

Training data quality is one of the most important drivers of final model quality. In this work, we introduce a method for evaluating data integrity based on the assumption that low-quality input prompts result in high-variance, low-quality responses. This is achieved by measuring the rejected response quality and the reward gap between the chosen and rejected responses in a preference pair. Our method, Rejecting Instruction Preferences (RIP), can be used to filter prompts from existing training sets or to make high-quality synthetic datasets, yielding large performance gains across various benchmarks compared to unfiltered data. Using Llama 3.1-8B-Instruct, RIP improves AlpacaEval2 LC Win Rate by 9.4%, Arena-Hard by 8.7%, and WildBench by 9.9%. Using Llama 3.3-70B-Instruct, RIP improves Arena-Hard from 67.5 to 82.9, moving it from 18th to 6th place overall on the leaderboard.
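The filtering idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes each prompt comes with several sampled responses already scored by some reward model, that the chosen and rejected responses are the highest- and lowest-scored ones, and that a prompt is kept only if its rejected response scores well enough and the reward gap is small enough. The names `ScoredPrompt`, `rip_filter`, `min_rejected_reward`, and `max_reward_gap`, as well as the threshold values, are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class ScoredPrompt:
    """A prompt with reward scores for its sampled responses (hypothetical type)."""
    prompt: str
    rewards: List[float]  # one reward-model score per sampled response


def preference_stats(rewards: List[float]) -> Tuple[float, float]:
    """Return (rejected_reward, reward_gap).

    Assumption: chosen = highest-scored response, rejected = lowest-scored.
    """
    chosen = max(rewards)
    rejected = min(rewards)
    return rejected, chosen - rejected


def rip_filter(
    data: List[ScoredPrompt],
    min_rejected_reward: float,
    max_reward_gap: float,
) -> List[ScoredPrompt]:
    """Keep prompts whose rejected response is good enough and whose gap is small."""
    kept = []
    for item in data:
        if len(item.rewards) < 2:
            continue  # a preference pair needs at least two responses
        rejected, gap = preference_stats(item.rewards)
        if rejected >= min_rejected_reward and gap <= max_reward_gap:
            kept.append(item)
    return kept


if __name__ == "__main__":
    # Toy scores for illustration only; thresholds are arbitrary.
    data = [
        ScoredPrompt("clear question", [4.0, 3.5, 3.8]),
        ScoredPrompt("ambiguous request", [5.0, 2.5]),
        ScoredPrompt("garbled text", [1.5, 1.0]),
    ]
    kept = rip_filter(data, min_rejected_reward=2.0, max_reward_gap=1.0)
    print([item.prompt for item in kept])  # ['clear question']
```

In this toy run, the second prompt is dropped because its reward gap is large (high variance across responses), and the third because even its rejected response scores poorly. In practice the thresholds would need to be tuned on the dataset at hand.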
