CLLGNov 17, 2023

Token-Level Adaptation of LoRA Adapters for Downstream Task Generalization

arXiv:2311.10847v218 citationsh-index: 1
Originality Incremental advance
AI Analysis

This work addresses the challenge of task generalization for smaller language models, though it appears incremental as it builds on existing LoRA and mixture-of-expert concepts.

The paper tackles the problem of adapting LoRA adapters for downstream tasks in smaller language models, achieving performance improvements over the base Llama-2-7b model across mathematical, scientific, reading comprehension, and coding tasks, with the best results from adapting every-other token during inference.

This paper introduces a method for adapting LoRA adapters in smaller-sized language models to arbitrary downstream tasks. Unlike standard mixture-of-expert architectures, our method employs a gradient-free routing function to choose a weighted combination of experts without increasing the compute requirements for training or inference. The results show that token-level adaptation of LoRA adapters outperforms the base Llama-2-7b model across mathematical (GSM8K), scientific (ARC-Challenge), reading comprehension (SQuAD), and coding (CodeAlpaca-20k) tasks. Further evaluations also show that the average performance of token-level adaptation outperforms individual models fine-tuned for each of the tasks with the best performance observed in adaptation of every-other token during inference. The code for this study is made available through a public repository.

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes