LG · AI · Oct 30, 2025

Loquetier: A Virtualized Multi-LoRA Framework for Unified LLM Fine-tuning and Serving

arXiv:2511.00101v1 · h-index: 2 · Has Code
Originality: Incremental advance
AI Analysis

The paper addresses a gap in efficiently integrating fine-tuning and serving for parameter-efficient LLM adaptation and offers a practical solution for AI practitioners, though the advance is incremental because it builds on existing LoRA techniques.

The paper unifies fine-tuning and inference for LoRA-based large language models by introducing Loquetier, a virtualized multi-LoRA framework. It reports up to 3.0x the throughput of the state-of-the-art co-serving system on inference-only tasks and 46.4x higher SLO attainment than PEFT on unified fine-tuning and inference tasks.

Low-Rank Adaptation (LoRA) has become a widely adopted parameter-efficient fine-tuning (PEFT) technique for adapting large language models (LLMs) to downstream tasks. While prior work has explored strategies for integrating LLM training and serving, there still remains a gap in unifying fine-tuning and inference for LoRA-based models. We present Loquetier, a virtualized multi-LoRA framework that seamlessly integrates LoRA fine-tuning and serving within a single runtime. Loquetier introduces two key components: (1) a Virtualized Module that isolates PEFT-based modifications and supports multiple adapters on a shared base model, and (2) an optimized computation flow with a kernel design that merges fine-tuning and inference paths in forward propagation, enabling efficient batching and minimizing kernel invocation overhead. Extensive experiments across three task settings show that Loquetier consistently outperforms existing baselines in both performance and flexibility, achieving up to $3.0\times$ the throughput of the state-of-the-art co-serving system on inference-only tasks and $46.4\times$ higher SLO attainment than PEFT on unified fine-tuning and inference tasks. The implementation of Loquetier is publicly available at https://github.com/NJUDeepEngine/Loquetier.
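
To make the idea of several adapters sharing one base model more concrete, here is a minimal pure-Python sketch of a mixed-adapter forward pass for a single linear layer. It is not Loquetier's implementation or kernel design. The function names (`matvec`, `multi_lora_forward`), the `(A, B, scale)` adapter layout, and the toy weights are all assumptions made for illustration. The sketch computes the base projection once for every request in the batch, then adds the standard LoRA low-rank update `scale * B(Ax)` only to the rows that name an adapter.

```python
from collections import defaultdict


def matvec(M, v):
    """Multiply matrix M (list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def multi_lora_forward(W, adapters, batch):
    """Forward pass of one linear layer shared by several LoRA adapters.

    W        : base weight, shape (d_out, d_in), shared by all requests
    adapters : hypothetical registry {adapter_id: (A, B, scale)} with
               A of shape (r, d_in) and B of shape (d_out, r)
    batch    : list of (x, adapter_id) pairs; adapter_id None = base only
    """
    # Base projection, computed once for every request in the batch.
    outputs = [matvec(W, x) for x, _ in batch]

    # Group row indices by adapter so each adapter is visited once.
    groups = defaultdict(list)
    for i, (_, adapter_id) in enumerate(batch):
        if adapter_id is not None:
            groups[adapter_id].append(i)

    # Add the low-rank update scale * B(Ax) to the rows of each group.
    for adapter_id, rows in groups.items():
        A, B, scale = adapters[adapter_id]
        for i in rows:
            delta = matvec(B, matvec(A, batch[i][0]))
            outputs[i] = [o + scale * d for o, d in zip(outputs[i], delta)]
    return outputs


# Toy example: 2x2 identity base weight and two rank-1 adapters.
W = [[1, 0], [0, 1]]
adapters = {
    "a": ([[1, 1]], [[1], [0]], 2.0),
    "b": ([[1, -1]], [[0], [1]], 0.5),
}
batch = [([1, 2], "a"), ([3, 1], "b"), ([4, 5], None)]
result = multi_lora_forward(W, adapters, batch)
```

In this toy run the three requests use adapter `a`, adapter `b`, and the bare base model respectively, yet they share a single pass over `W`. A production system would replace the Python loops with batched GPU kernels; the sketch only shows the data flow.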

