CL | Oct 17, 2023

VeRA: Vector-based Random Matrix Adaptation

arXiv:2310.11454v2 | 319 citations | h-index: 26
Originality: Incremental advance
AI Analysis

VeRA addresses the storage cost of deploying many adapted models in large-scale AI applications; it is an incremental improvement over LoRA.

The paper tackles the storage challenge of low-rank adaptation (LoRA) for finetuning large language models by introducing Vector-based Random Matrix Adaptation (VeRA), which significantly reduces trainable parameters while maintaining the same performance on benchmarks like GLUE, E2E, and image classification tasks.

Low-rank adaptation (LoRA) is a popular method that reduces the number of trainable parameters when finetuning large language models, but still faces acute storage challenges when scaling to even larger models or deploying numerous per-user or per-task adapted models. In this work, we present Vector-based Random Matrix Adaptation (VeRA), which significantly reduces the number of trainable parameters compared to LoRA, yet maintains the same performance. It achieves this by using a single pair of low-rank matrices shared across all layers and learning small scaling vectors instead. We demonstrate its effectiveness on the GLUE and E2E benchmarks and on image classification tasks, and show its application in instruction-tuning of 7B and 13B language models.
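
The mechanism described in the abstract (one frozen pair of low-rank matrices shared by every layer, plus small trainable scaling vectors per layer) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The names `make_shared_matrices`, `VeRALayer`, and `lora_trainable` are hypothetical, and the Gaussian initialization of the shared matrices, the zero initialization of `b`, and the constant initialization of `d` are assumptions made for illustration.

```python
import numpy as np


def make_shared_matrices(d_in, d_out, rank, seed=0):
    """Create one frozen random pair (A, B) to be reused by every layer.

    Assumption: Gaussian entries with simple fan-in scaling.
    """
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((rank, d_in)) / np.sqrt(d_in)
    B = rng.standard_normal((d_out, rank)) / np.sqrt(rank)
    return A, B


class VeRALayer:
    """Linear layer with a frozen weight and a vector-scaled low-rank update."""

    def __init__(self, W0, A, B, d_init=0.1):
        self.W0 = W0  # frozen pretrained weight, shape (d_out, d_in)
        self.A = A    # frozen, shared across layers, shape (rank, d_in)
        self.B = B    # frozen, shared across layers, shape (d_out, rank)
        # Trainable per-layer scaling vectors (initial values are assumptions).
        self.d = np.full(A.shape[0], d_init)  # scales the rank dimension
        self.b = np.zeros(B.shape[0])         # scales the output dimension

    def forward(self, x):
        # x has shape (batch, d_in).
        low = (x @ self.A.T) * self.d       # project down, scale by d
        delta = (low @ self.B.T) * self.b   # project up, scale by b
        return x @ self.W0.T + delta

    def num_trainable(self):
        # Only the two vectors are trained and stored per layer.
        return self.d.size + self.b.size


def lora_trainable(d_in, d_out, rank):
    """Trainable parameters of a rank-`rank` LoRA update for one layer."""
    return rank * (d_in + d_out)
```

Under these assumptions, a layer with `d_in = 64`, `d_out = 32`, and `rank = 4` trains `4 + 32 = 36` values, compared with `4 * (64 + 32) = 384` for a LoRA update of the same rank. Because `b` starts at zero, the adapted layer initially reproduces the frozen layer's output.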

