Expand Heterogeneous Learning Systems with Selective Multi-Source Knowledge Fusion
This work addresses the problem of customizing models for new users or domains in heterogeneous environments, representing an incremental improvement over existing knowledge distillation methods.
The paper tackles the challenge of expanding learning systems to new domains under limited labeled data and heterogeneity. It proposes HaT, a framework that selects and fuses knowledge from multiple models, achieving up to 16.5% higher accuracy than state-of-the-art baselines and up to 39% less communication traffic.
Expanding existing learning systems to provide high-quality customized models for more domains, such as new users, is challenging because labeled data is limited and both data and devices are heterogeneous. While knowledge distillation methods can overcome label scarcity and device heterogeneity, they assume that the teachers are fully reliable and overlook data heterogeneity, which prevents the direct adoption of existing models. To address this problem, this paper proposes HaT, a framework for expanding learning systems. It first selects multiple high-quality models from the system at low cost and then fuses their knowledge by assigning sample-wise weights to their predictions. The fused knowledge is then selectively injected into the customized models based on its quality. Extensive experiments on different tasks, modalities, and settings show that HaT outperforms state-of-the-art baselines by up to 16.5% in accuracy and saves up to 39% of communication traffic.
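To make the fuse-then-inject idea above more concrete, here is a minimal sketch in Python. It is not HaT's actual algorithm: the abstract does not say how the sample-wise weights or the knowledge quality are computed, so this sketch assumes an entropy-based confidence weight per teacher and a simple confidence threshold for injection. The names `fuse_sample`, `select_for_injection`, and `min_confidence` are hypothetical and introduced only for illustration.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy (in nats) of a probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def fuse_sample(teacher_probs):
    """Fuse the predictions of several teachers for ONE sample.

    teacher_probs: list of probability vectors, one per selected teacher.
    Assumption (not from the paper): a teacher's weight on this sample is
    the softmax of its negative prediction entropy, so more confident
    teachers contribute more.
    """
    weights = softmax([-entropy(p) for p in teacher_probs])
    num_classes = len(teacher_probs[0])
    fused = [
        sum(w * p[c] for w, p in zip(weights, teacher_probs))
        for c in range(num_classes)
    ]
    return fused, weights


def select_for_injection(fused_batch, min_confidence=0.6):
    """Return indices of samples whose fused prediction looks reliable.

    Assumption: "knowledge quality" is approximated by the top fused
    probability; only samples at or above the threshold would be used
    as soft targets when training the customized model.
    """
    return [
        i for i, fused in enumerate(fused_batch)
        if max(fused) >= min_confidence
    ]


# Two unlabeled samples, each scored by two selected teacher models.
batch = [
    [[0.90, 0.10], [0.60, 0.40]],  # teachers agree, one is confident
    [[0.50, 0.50], [0.55, 0.45]],  # both teachers are unsure
]
fused_batch = [fuse_sample(sample)[0] for sample in batch]
kept = select_for_injection(fused_batch)
print(kept)
```

In this toy batch, only the first sample passes the threshold, so only its fused prediction would be distilled into the customized model. A real system would replace both heuristics with whatever weighting and quality estimate the paper defines.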