LG | May 29, 2025

Position: Federated Foundation Language Model Post-Training Should Focus on Open-Source Models

Nikita Agrawal, Simon Mertel, Ruben Mayer

arXiv:2505.23593v24.1 | h-index: 1

Originality: Synthesis-oriented

AI Analysis

The paper addresses privacy and autonomy concerns in federated learning that matter to researchers and practitioners, but its contribution is incremental: it critiques existing approaches without presenting new empirical results.

The paper argues that using black-box foundation language models in federated learning post-training contradicts privacy and autonomy principles, advocating instead for open-source models to better align with federated learning goals.

Post-training of foundation language models has emerged as a promising research domain in federated learning (FL), with the goal of enabling privacy-preserving model improvements and adaptations to users' downstream tasks. Recent advances in this area adopt centralized post-training approaches that build upon black-box foundation language models, where there is no access to model weights or architecture details. Although the use of black-box models has been successful in centralized post-training, their blind replication in FL raises several concerns. Our position is that using black-box models in FL contradicts the core principles of federation, such as data privacy and autonomy. In this position paper, we critically analyze the usage of black-box models in federated post-training and provide a detailed account of various aspects of openness and their implications for FL.
