LG · Dec 22, 2024

Expert Routing with Synthetic Data for Continual Learning

Yewon Byun, Sanket Vaibhav Mehta, Saurabh Garg, Emma Strubell, Michael Oberst, Bryan Wilder, Zachary C. Lipton

DeepMind

arXiv:2412.17009v3 · 2.6 · h-index: 27

Originality: Incremental advance

AI Analysis

The paper addresses the challenge of adapting models across domains without data sharing. The advance is incremental, as it builds on existing ensemble and synthetic-data approaches.

The paper tackles catastrophic forgetting in continual learning by proposing G2D, a method that uses synthetic data to train a domain discriminator for routing samples to domain-specific experts. G2D outperforms competitive methods on vision and language tasks.

In many real-world settings, regulations and economic incentives permit the sharing of models but not data across institutional boundaries. In such scenarios, practitioners might hope to adapt models to new domains, without losing performance on previous domains (so-called catastrophic forgetting). While any single model may struggle to achieve this goal, learning an ensemble of domain-specific experts offers the potential to adapt more closely to each individual institution. However, a core challenge in this context is determining which expert to deploy at test time. In this paper, we propose Generate to Discriminate (G2D), a domain-incremental continual learning method that leverages synthetic data to train a domain-discriminator that routes samples at inference time to the appropriate expert. Surprisingly, we find that leveraging synthetic data in this capacity is more effective than using the samples to directly train the downstream classifier (the more common approach to leveraging synthetic data in the lifelong learning literature). We observe that G2D outperforms competitive domain-incremental learning methods on tasks in both vision and language modalities, providing a new perspective on the use of synthetic data in the lifelong learning literature.
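The routing idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the 1-D Gaussian "generators", the nearest-centroid discriminator, the threshold "experts", and all names (`generators`, `experts`, `train_discriminator`, `route_and_predict`) are hypothetical stand-ins chosen only to show the flow, in which synthetic samples train a domain discriminator that picks the expert at inference time.

```python
import random
from statistics import mean

random.seed(0)

# Hypothetical stand-ins: each "generator" is a 1-D Gaussian summarizing one domain.
# In the setting described above, this would be a generative model trained per domain.
generators = {
    "domain_a": lambda: random.gauss(-5.0, 1.0),
    "domain_b": lambda: random.gauss(5.0, 1.0),
}

# Hypothetical domain-specific experts: toy classifiers with different thresholds.
experts = {
    "domain_a": lambda x: int(x > -5.0),
    "domain_b": lambda x: int(x > 5.0),
}


def train_discriminator(generators, n_samples=200):
    """Fit a nearest-centroid domain discriminator using only synthetic samples."""
    centroids = {
        name: mean(sample() for _ in range(n_samples))
        for name, sample in generators.items()
    }

    def discriminate(x):
        # Predict the domain whose synthetic centroid is closest to x.
        return min(centroids, key=lambda name: abs(x - centroids[name]))

    return discriminate


def route_and_predict(x, discriminate, experts):
    """Route a test sample to the expert of its predicted domain."""
    domain = discriminate(x)
    return domain, experts[domain](x)


discriminate = train_discriminator(generators)
result_a = route_and_predict(-4.2, discriminate, experts)
result_b = route_and_predict(4.1, discriminate, experts)
print(result_a, result_b)
```

Note that the discriminator never sees real data from either domain, only samples drawn from the per-domain generators, which mirrors the constraint that models can be shared across institutions while data cannot.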

