Superposition in Transformers: A Novel Way of Building Mixture of Experts
This work addresses the loss of existing knowledge when users adapt LLMs to new tasks and offers a potential remedy for catastrophic forgetting, though the contribution appears incremental, since it builds on existing fine-tuning and mixture-of-experts concepts.
The paper tackles catastrophic forgetting in large language models during fine-tuning by introducing Superposition in Transformers, a novel architecture that uses autoencoders and B-spline blending to superimpose a base model and a fine-tuned model. The approach preserves the original capabilities while adding domain-specific expertise, and it enables dynamic switching between the two model states.
Catastrophic forgetting remains a major challenge when adapting large language models (LLMs) to new tasks or domains. Conventional fine-tuning often overwrites existing knowledge, causing performance degradation on original tasks. We introduce Superposition in Transformers, a novel architecture that leverages autoencoders to superimpose the hidden representations of a base model and a fine-tuned model within a shared parameter space. By using B-spline-based blending coefficients and autoencoders that adaptively reconstruct hidden states based on the input data distribution, our method effectively mitigates catastrophic forgetting and enables a new paradigm of "in-model" superposition. This approach preserves original model capabilities while allowing compact domain-specific expertise to be added, and it supports dynamic switching between model states during inference.
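To make the blending idea concrete, here is a minimal sketch of how a B-spline could produce blending coefficients for mixing hidden states from a base model and a fine-tuned model. The abstract does not specify the exact formulation, so several things here are assumptions for illustration: the blend is a convex combination, one coefficient is used per layer, a sigmoid keeps the coefficient in (0, 1), and the autoencoder reconstruction step is omitted entirely. The function names (`bspline_basis`, `clamped_knots`, `blend_coefficient`, `blend_hidden`) are hypothetical and not taken from the paper.

```python
import math


def bspline_basis(i, k, t, knots):
    """Cox-de Boor recursion: i-th B-spline basis function of degree k at t."""
    if k == 0:
        return 1.0 if knots[i] <= t < knots[i + 1] else 0.0
    left_den = knots[i + k] - knots[i]
    right_den = knots[i + k + 1] - knots[i + 1]
    left = 0.0
    if left_den > 0:
        left = (t - knots[i]) / left_den * bspline_basis(i, k - 1, t, knots)
    right = 0.0
    if right_den > 0:
        right = ((knots[i + k + 1] - t) / right_den
                 * bspline_basis(i + 1, k - 1, t, knots))
    return left + right


def clamped_knots(n_ctrl, degree):
    """Open-uniform (clamped) knot vector on [0, 1]."""
    n_inner = n_ctrl - degree - 1
    inner = [(j + 1) / (n_inner + 1) for j in range(n_inner)]
    return [0.0] * (degree + 1) + inner + [1.0] * (degree + 1)


def blend_coefficient(t, control_points, degree=3):
    """Assumed form: alpha(t) = sigmoid(B-spline curve at t), with t in [0, 1]."""
    knots = clamped_knots(len(control_points), degree)
    t = min(max(t, 0.0), 1.0 - 1e-9)  # stay inside the half-open last interval
    s = sum(c * bspline_basis(i, degree, t, knots)
            for i, c in enumerate(control_points))
    return 1.0 / (1.0 + math.exp(-s))


def blend_hidden(h_base, h_ft, alpha):
    """Assumed convex combination of base and fine-tuned hidden states."""
    return [(1.0 - alpha) * b + alpha * f for b, f in zip(h_base, h_ft)]


if __name__ == "__main__":
    # Hypothetical control points (these would be learned in a real model):
    # early layers lean on the base model, later layers on the fine-tuned one.
    control_points = [-4.0, -2.0, 0.0, 2.0, 4.0]
    n_layers = 6
    for layer in range(n_layers):
        alpha = blend_coefficient(layer / (n_layers - 1), control_points)
        h = blend_hidden([1.0, 2.0], [3.0, 4.0], alpha)
        print(f"layer {layer}: alpha={alpha:.3f}, blended={h}")
```

In this sketch the control points are the only blending parameters, which is one way a small number of extra parameters could steer the mix smoothly across layers. In the paper, autoencoders additionally reconstruct the hidden states based on the input distribution; that part is not shown here.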