LGAIMar 26, 2025

Guided Model Merging for Hybrid Data Learning: Leveraging Centralized Data to Refine Decentralized Models

arXiv:2503.20138v2h-index: 7
AI Analysis

This addresses a practical problem in machine learning where data availability is often hybrid, offering a robust solution for scenarios with domain mismatches or noise.

The paper tackles the challenge of training models with hybrid data availability (both centralized and decentralized data) by proposing a framework that constructs a model atlas from decentralized models and refines it using centralized data, achieving faster convergence and outperforming existing methods in experiments.

Current network training paradigms primarily focus on either centralized or decentralized data regimes. However, in practice, data availability often exhibits a hybrid nature, where both regimes coexist. This hybrid setting presents new opportunities for model training, as the two regimes offer complementary trade-offs: decentralized data is abundant but subject to heterogeneity and communication constraints, while centralized data, though limited in volume and potentially unrepresentative, enables better curation and high-throughput access. Despite its potential, effectively combining these paradigms remains challenging, and few frameworks are tailored to hybrid data regimes. To address this, we propose a novel framework that constructs a model atlas from decentralized models and leverages centralized data to refine a global model within this structured space. The refined model is then used to reinitialize the decentralized models. Our method synergizes federated learning (to exploit decentralized data) and model merging (to utilize centralized data), enabling effective training under hybrid data availability. Theoretically, we show that our approach achieves faster convergence than methods relying solely on decentralized data, due to variance reduction in the merging process. Extensive experiments demonstrate that our framework consistently outperforms purely centralized, purely decentralized, and existing hybrid-adaptable methods. Notably, our method remains robust even when the centralized and decentralized data domains differ or when decentralized data contains noise, significantly broadening its applicability.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes