LG | AI | Apr 7

TalkLoRA: Communication-Aware Mixture of Low-Rank Adaptation for Large Language Models

arXiv:2604.06291 | 72.41 citations | h-index: 9 | Has Code
AI Analysis

This work addresses a specific bottleneck in parameter-efficient fine-tuning for LLMs, offering an incremental improvement for researchers and practitioners in natural language processing.

The paper addresses unstable routing and expert dominance in Mixture-of-Experts (MoE) extensions of Low-Rank Adaptation (LoRA) for Large Language Models. It proposes TalkLoRA, a communication-aware framework that lets experts exchange information before routing in order to smooth routing dynamics. The authors report that TalkLoRA consistently outperforms vanilla LoRA and MoELoRA across diverse tasks, with higher parameter efficiency and more balanced expert routing.

Low-Rank Adaptation (LoRA) enables parameter-efficient fine-tuning of Large Language Models (LLMs), and recent Mixture-of-Experts (MoE) extensions further enhance flexibility by dynamically combining multiple LoRA experts. However, existing MoE-augmented LoRA methods assume that experts operate independently, often leading to unstable routing and expert dominance. In this paper, we propose TalkLoRA, a communication-aware MoELoRA framework that relaxes this independence assumption by introducing expert-level communication prior to routing. TalkLoRA equips low-rank experts with a lightweight Talking Module that enables controlled information exchange across expert subspaces, producing a more robust global signal for routing. Theoretically, we show that expert communication smooths routing dynamics by mitigating perturbation amplification while strictly generalizing existing MoELoRA architectures. Empirically, TalkLoRA consistently outperforms vanilla LoRA and MoELoRA across diverse language understanding and generation tasks, achieving higher parameter efficiency and more balanced expert routing under comparable parameter budgets. These results highlight structured expert communication as a principled and effective enhancement for MoE-based parameter-efficient adaptation. Code is available at https://github.com/why0129/TalkLoRA.
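
To make the idea of "communication prior to routing" concrete, here is a minimal sketch in plain Python. It is not the authors' implementation, and the abstract does not specify the form of the Talking Module. The sketch assumes the module is a simple n-by-n mixing matrix `T` applied across the experts' low-rank features, and that the router reads the flattened mixed features. All names (`talk`, `moe_lora_delta`, `T_talk`, `W_router`, `eps`) are hypothetical; the linked repository is the place to look for the actual method.

```python
import math
import random


def matvec(M, v):
    """Multiply matrix M (list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def softmax(z):
    """Numerically stable softmax over a list of floats."""
    top = max(z)
    exps = [math.exp(v - top) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def rand_matrix(rows, cols, rng, scale=0.1):
    """Random Gaussian matrix, used only to build toy weights."""
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


def talk(H, T):
    """Assumed communication step: H_out[i] = sum_j T[i][j] * H[j]."""
    n, r = len(H), len(H[0])
    return [[sum(T[i][j] * H[j][k] for j in range(n)) for k in range(r)]
            for i in range(n)]


def moe_lora_delta(x, A, B, T, W_router):
    """Return (delta, gates) for one token x.

    A[i]: rank x d_in down-projection of expert i
    B[i]: d_out x rank up-projection of expert i
    T: n x n expert-mixing matrix (assumption, not from the paper)
    W_router: n x (n * rank) router weights over the flattened mixed features
    """
    H = [matvec(A_i, x) for A_i in A]        # per-expert low-rank features
    H_mixed = talk(H, T)                     # experts exchange information
    flat = [v for h in H_mixed for v in h]   # shared signal fed to the router
    gates = softmax(matvec(W_router, flat))
    delta = [0.0] * len(B[0])
    for g, B_i, h in zip(gates, B, H_mixed):
        up = matvec(B_i, h)
        delta = [d + g * u for d, u in zip(delta, up)]
    return delta, gates


rng = random.Random(0)
n_experts, rank, d_in, d_out = 4, 2, 8, 8
A = [rand_matrix(rank, d_in, rng) for _ in range(n_experts)]
B = [rand_matrix(d_out, rank, rng) for _ in range(n_experts)]
W_router = rand_matrix(n_experts, n_experts * rank, rng)
x = [rng.gauss(0.0, 1.0) for _ in range(d_in)]

# Identity mixing: every expert sees only its own features (no communication).
identity = [[1.0 if i == j else 0.0 for j in range(n_experts)]
            for i in range(n_experts)]
# Small off-diagonal weight eps lets each expert see a little of the others.
eps = 0.1
T_talk = [[1.0 - eps if i == j else eps / (n_experts - 1)
           for j in range(n_experts)] for i in range(n_experts)]

delta_indep, gates_indep = moe_lora_delta(x, A, B, identity, W_router)
delta_talk, gates_talk = moe_lora_delta(x, A, B, T_talk, W_router)
```

In this sketch, setting `T` to the identity leaves each expert's features untouched, so the no-communication case is a special case of the communicating one. That mirrors, in a toy way, the abstract's claim that the framework generalizes existing MoELoRA architectures; how the paper establishes this formally is not shown here.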

