HC DC LG Jun 4, 2025

Crowd-SFT: Crowdsourcing for LLM Alignment

Alex Sotiropoulos, Sulyab Thottungal Valapu, Linus Lei, Jared Coleman, Bhaskar Krishnamachari

arXiv:2506.04063v14.11 citations h-index: 6 DAPPCON

Originality Incremental advance

AI Analysis

The work addresses scalability and fairness issues in LLM alignment for AI developers, though it is an incremental advance that builds on existing SFT and RLHF methods.

The paper tackles the problem of costly and biased alignment of Large Language Models by proposing a crowd-sourced fine-tuning framework. Its multi-model selection reduces target distance by up to 55% relative to single-model selection, and its point-based reward mechanism closely tracks Shapley values to keep incentives fair.

Large Language Models (LLMs) increasingly rely on Supervised Fine-Tuning (SFT) and Reinforcement Learning from Human Feedback (RLHF) to align model responses with human preferences. While RLHF employs a reinforcement learning approach with a separate reward model, SFT uses human-curated datasets for supervised learning. Both approaches traditionally depend on small, vetted groups of annotators, making them costly, prone to bias, and limited in scalability. We propose an open, crowd-sourced fine-tuning framework that addresses these limitations by enabling broader feedback collection for SFT without extensive annotator training. Our framework promotes incentive fairness via a point-based reward system correlated with Shapley values and guides model convergence through iterative model updates. Our multi-model selection framework demonstrates up to a 55% reduction in target distance over single-model selection, enabling subsequent experiments that validate our point-based reward mechanism's close alignment with Shapley values (a well-established method for attributing individual contributions), thereby supporting fair and scalable participation.
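For readers unfamiliar with Shapley values, the sketch below computes them exactly for a toy three-contributor setting by averaging each contributor's marginal contribution over every possible joining order. It is an illustration of the attribution method only, not the paper's point-based reward mechanism. The contributor names, the `QUALITY` scores, and the `coalition_value` function are hypothetical and chosen purely to keep the arithmetic easy to follow.

```python
from itertools import permutations
from math import factorial


def shapley_values(players, value):
    """Exact Shapley values: average marginal contribution over all orderings.

    `value` maps a frozenset of players to a number. The cost grows
    factorially with the number of players, so this is only practical
    for very small groups.
    """
    totals = {p: 0.0 for p in players}
    for order in permutations(players):
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            # Marginal contribution of p when joining the current coalition.
            totals[p] += value(with_p) - value(coalition)
            coalition = with_p
    n_orders = factorial(len(players))
    return {p: total / n_orders for p, total in totals.items()}


# Hypothetical per-contributor quality scores (assumption, not from the paper).
QUALITY = {"alice": 3.0, "bob": 1.0, "carol": 2.0}


def coalition_value(coalition):
    """Hypothetical value function: additive quality plus a synergy bonus
    of 1.0 when alice and bob both participate."""
    base = sum(QUALITY[p] for p in coalition)
    bonus = 1.0 if {"alice", "bob"} <= coalition else 0.0
    return base + bonus


phi = shapley_values(list(QUALITY), coalition_value)
for name, share in phi.items():
    print(f"{name}: {share:.2f}")
```

In this toy setting the synergy bonus is split evenly between the two contributors who create it, and the shares sum to the value of the full group. Because exact computation requires enumerating every ordering, a cheaper reward scheme that correlates with these values is attractive when the number of contributors is large.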
