HC | DC | LG | Jun 4, 2025

Crowd-SFT: Crowdsourcing for LLM Alignment

arXiv:2506.04063v1 | 1 citation | h-index: 6 | DAPPCON
Originality: Incremental advance
AI Analysis

The paper addresses scalability and fairness issues in LLM alignment for AI developers, though the advance is incremental because it builds on existing SFT and RLHF methods.

The paper tackles the costly and bias-prone alignment of Large Language Models by proposing a crowd-sourced fine-tuning framework. Its multi-model selection reduces target distance by up to 55% relative to single-model selection, and its point-based rewards closely track Shapley values, which supports fair incentives for contributors.

Large Language Models (LLMs) increasingly rely on Supervised Fine-Tuning (SFT) and Reinforcement Learning from Human Feedback (RLHF) to align model responses with human preferences. While RLHF employs a reinforcement learning approach with a separate reward model, SFT uses human-curated datasets for supervised learning. Both approaches traditionally depend on small, vetted groups of annotators, making them costly, prone to bias, and limited in scalability. We propose an open, crowd-sourced fine-tuning framework that addresses these limitations by enabling broader feedback collection for SFT without extensive annotator training. Our framework promotes incentive fairness via a point-based reward system correlated with Shapley values and guides model convergence through iterative model updates. Our multi-model selection framework demonstrates up to a 55% reduction in target distance over single-model selection, enabling subsequent experiments that validate our point-based reward mechanism's close alignment with Shapley values (a well-established method for attributing individual contributions), thereby supporting fair and scalable participation.
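
For readers unfamiliar with Shapley values, the following is a minimal sketch of how individual contributions can be attributed exactly for a small group. It is not the paper's implementation: the contributor names, the `TOY_VALUE` table, and the idea of scoring a coalition by its reduction in target distance are all assumptions made for illustration. A contributor's Shapley value is their marginal contribution averaged over every order in which contributors could join.

```python
from itertools import permutations
from math import isclose


def shapley_values(players, value):
    """Exact Shapley values: average each player's marginal
    contribution over all orderings of the players."""
    totals = {p: 0.0 for p in players}
    orderings = list(permutations(players))
    for order in orderings:
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            # Marginal gain from adding p to the current coalition.
            totals[p] += value(with_p) - value(coalition)
            coalition = with_p
    return {p: t / len(orderings) for p, t in totals.items()}


# Hypothetical value table (assumption, not data from the paper):
# how much each coalition of contributors reduces target distance.
TOY_VALUE = {
    frozenset(): 0.00,
    frozenset({"a"}): 0.30,
    frozenset({"b"}): 0.20,
    frozenset({"c"}): 0.10,
    frozenset({"a", "b"}): 0.45,
    frozenset({"a", "c"}): 0.35,
    frozenset({"b", "c"}): 0.25,
    frozenset({"a", "b", "c"}): 0.50,
}


def toy_value(coalition):
    """Look up the hypothetical value of a coalition."""
    return TOY_VALUE[frozenset(coalition)]


phi = shapley_values(["a", "b", "c"], toy_value)

# Efficiency property: the shares sum to the value of the full group.
assert isclose(sum(phi.values()), toy_value({"a", "b", "c"}), abs_tol=1e-9)
```

Exact enumeration grows factorially with the number of contributors, so it is only practical for small groups. That cost is one reason a cheaper point-based score that correlates with Shapley values, as the abstract describes, is attractive.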
