LoRA (Parameter-efficient fine-tuning (LoRA family)): heavily superseded, a standard baseline that newer methods routinely beat. 194 papers critique it and 201 beat it on benchmarks, which ranks it #1 of 1,113 most-superseded. Sub-problem: cluster led by LoRA. Newer alternatives in the same sub-problem include Balanced LoRA, FedSmoothLoRA, FuRA, LoRA-Over, and Hybrid-LoRA.


Heavily superseded · #1 of 1,113 most-superseded

LoRA

LoRA: Low-Rank Adaptation of Large Language Models

Parameter-efficient fine-tuning (LoRA family) · first seen Jun 17, 2021

heavily superseded — a standard baseline that newer methods routinely beat

194 papers critique it · 201 beat it on benchmarks

What papers say

Verbatim critique sentences, each from a paper that cites LoRA as a baseline.

“While effective for fine-tuning, existing LoRA-based methods face fundamental challenges when applied to pre-training from scratch. Unlike fine-tuning, where small adaptations naturally exhibit low-rank structure, pre-training from random initialization requires full-rank weight updates to learn diverse representations across the entire parameter space. This mismatch between LoRA's low-rank assumption and pre-training's full-rank requirements results in suboptimal performance in the pre-training stage.”
— Taming Momentum: Rethinking Optimizer States Through Low-Rank Approximation
“Due to the design of MHSA, these modules include multiple heads (12 in ViT), and thus the weight updates approximated by the adapters also encompass updates for these multiple heads. However, is the pre-configured multi-head setup in Transformer necessarily essential?”
— Rethinking Low-Rank Adaptation in Vision: Exploring Head-Level Responsiveness across Diverse Tasks
“While naive LoRA significantly degrades text performance, PLoRA can be seen to preserve the original text capabilities”
— Exploiting Mixture-of-Experts Redundancy Unlocks Multimodal Generative Abilities
“Despite its success, LoRA and similar low-rank approaches still fall short of full fine-tuning in some settings.”
— LoFT: Low-Rank Adaptation That Behaves Like Full Fine-Tuning
“Existing LoRA-style methods often assume a global low-rank structure across entire weight matrices. However, such assumptions can be overly restrictive: different tasks may activate different subspaces or require localized updates that a single low-rank component fails to capture.”
— Localized LoRA: A Structured Low-Rank Approximation for Efficient Fine-Tuning
“However, the full potential of LoRA remains constrained by its inherent design limitations. Specifically, it assumes a uniform rank r for each incremental matrix, not accounting for the varying significance of weight matrices across different modules and layers.”
— Sensitivity-LoRA: Low-Load Sensitivity-Based Fine-Tuning for Large Language Models
“Yet, as discussed in Sec. sec: fur matrix based, this adaptation fails to capture the inherent complexity and spatial locality specific to convolution operations. The result is a reshaped two-dimensional structure that compromises the integrity of the original parameter space, leading to a representation that does not fully encapsulate the change of convolutional space.”
— Maintaining Structural Integrity in Parameter Spaces for Parameter Efficient Fine-tuning
“a single LoRA module projects the features of different tasks into the same dense low-dimensional space, causing interference between tasks and failing to effectively separate the knowledge of different tasks”
— CoLA: Collaborative Low-Rank Adaptation
“LoRA uniformly uses the same rank for all layers, without considering the difference across layers.”
— AutoLoRA: Automatically Tuning Matrix Ranks in Low-Rank Adaptation Based on Meta Learning
“Despite its effectiveness and popularity, recent studies have underscored that LoRA and its variants face challenges such as diminishing performance [LoRA], and slower convergence [PiSSA] relative to full fine-tuning, which deteriorate further as the rank declines [MoRA, HiRA].”
— ScaLoRA: Optimally Scaled Low-Rank Adaptation for Efficient High-Rank Fine-Tuning
“However, its fixed-rank design across all layers limits flexibility and prevents adaptive capacity allocation to task-specific requirements.”
— FlexLoRA: Entropy-Guided Flexible Low-Rank Adaptation
“Recognizing the LoRA's suboptimality of rigidly applying the same rank to all layers”
— AdaRank: Disagreement Based Module Rank Prediction for Low-rank Adaptation
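
Several of the critiques above target two design choices: each weight update is constrained to rank r, and the same r is used in every layer. The following is a minimal sketch of what that implies for trainable-parameter counts, assuming the standard LoRA formulation in which a frozen d x k weight matrix receives an update B A, with B of shape d x r and A of shape r x k. The layer shapes, rank allocations, and function names are hypothetical and chosen only for illustration; they are not taken from any of the cited papers.

```python
# Illustrative sketch only: layer shapes, ranks, and helper names are hypothetical.

def full_params(d, k):
    """Trainable parameters when the whole d x k matrix is updated."""
    return d * k


def lora_params(d, k, r):
    """Trainable parameters for a rank-r update B (d x r) times A (r x k)."""
    return r * (d + k)


def total_lora_params(shapes, ranks):
    """Sum LoRA parameters over layers, using one rank per layer."""
    return sum(lora_params(d, k, r) for (d, k), r in zip(shapes, ranks))


# Hypothetical model: four square 4096 x 4096 projection matrices.
shapes = [(4096, 4096)] * 4

# Uniform allocation: the same rank in every layer (the design the critiques target).
uniform = [8, 8, 8, 8]
# A non-uniform allocation that spends the same parameter budget differently.
adaptive = [2, 4, 10, 16]

uniform_total = total_lora_params(shapes, uniform)
adaptive_total = total_lora_params(shapes, adaptive)
full_total = sum(full_params(d, k) for d, k in shapes)

print(uniform_total, adaptive_total, full_total)
print(f"LoRA trains {uniform_total / full_total:.4%} of the full parameter count")
```

Both allocations cost the same number of trainable parameters here because every layer has the same shape, so the only difference is where the capacity goes. Whether a non-uniform allocation actually helps is an empirical question that the rank-adaptive papers quoted above address in their own ways.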

Beaten on benchmarks

Head-to-head results where a newer method reports beating LoRA. Values are copied from the source paper's tables — verify against the cited paper.

LoRA-Pre beats LoRA · Average [Llama-3.1-8B Adam-like]
47.05 vs 43.91
Taming Momentum: Rethinking Optimizer States Through Low-Rank Approximation
LoRA-Pre beats LoRA · Average [Llama-2-7B Adam-like]
32.15 vs 25.98
Taming Momentum: Rethinking Optimizer States Through Low-Rank Approximation
FourierFT beats LoRA · MSE [Chronos (Tiny)]
19.51 vs 19.79
Beyond LoRA: Exploring Efficient Fine-Tuning Techniques for Time Series Foundational Models
FourierFT beats LoRA · DTW [Chronos (Tiny)]
18.55 vs 19.86
Beyond LoRA: Exploring Efficient Fine-Tuning Techniques for Time Series Foundational Models
FourierFT beats LoRA · MAPE [Chronos (Tiny)]
7.76 vs 7.90
Beyond LoRA: Exploring Efficient Fine-Tuning Techniques for Time Series Foundational Models
FourierFT beats LoRA · MSE [Chronos (Small)]
19.65 vs 19.89
Beyond LoRA: Exploring Efficient Fine-Tuning Techniques for Time Series Foundational Models
FourierFT beats LoRA · DTW [Chronos (Small)]
19.98 vs 20.44
Beyond LoRA: Exploring Efficient Fine-Tuning Techniques for Time Series Foundational Models
FourierFT beats LoRA · MAPE [Chronos (Small)]
7.94 vs 8.02
Beyond LoRA: Exploring Efficient Fine-Tuning Techniques for Time Series Foundational Models
FourierFT beats LoRA · DTW [Chronos (Base)]
17.96 vs 21.06
Beyond LoRA: Exploring Efficient Fine-Tuning Techniques for Time Series Foundational Models
FourierFT beats LoRA · MAPE [Chronos (Base)]
7.86 vs 8.06
Beyond LoRA: Exploring Efficient Fine-Tuning Techniques for Time Series Foundational Models
LoRA-SIBO beats LoRA · Overall [GPT-J (6B)]
42.6 vs 37.5
SIBO: A Simple Booster for Parameter-Efficient Fine-Tuning
LoRA-SIBO beats LoRA · Overall [LLaMA (7B)]
47.5 vs 46.9
SIBO: A Simple Booster for Parameter-Efficient Fine-Tuning
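
To compare margins across metrics with different scales, the entries above can be converted to relative improvements over LoRA. The sketch below uses only the values listed in this section, with the first number in each pair taken as the newer method's result. Treating MSE, DTW, and MAPE as error metrics (lower is better) and Average and Overall as scores (higher is better) is an assumption, as is the helper name `relative_gain`; the underlying values should still be verified against the cited papers.

```python
# Values copied from the entries listed above:
# (newer method, metric, setting, newer value, LoRA value, lower_is_better).
# The lower_is_better flags are an assumption based on the metric names.
results = [
    ("LoRA-Pre", "Average", "Llama-3.1-8B Adam-like", 47.05, 43.91, False),
    ("LoRA-Pre", "Average", "Llama-2-7B Adam-like", 32.15, 25.98, False),
    ("FourierFT", "MSE", "Chronos (Tiny)", 19.51, 19.79, True),
    ("FourierFT", "DTW", "Chronos (Tiny)", 18.55, 19.86, True),
    ("FourierFT", "MAPE", "Chronos (Tiny)", 7.76, 7.90, True),
    ("FourierFT", "MSE", "Chronos (Small)", 19.65, 19.89, True),
    ("FourierFT", "DTW", "Chronos (Small)", 19.98, 20.44, True),
    ("FourierFT", "MAPE", "Chronos (Small)", 7.94, 8.02, True),
    ("FourierFT", "DTW", "Chronos (Base)", 17.96, 21.06, True),
    ("FourierFT", "MAPE", "Chronos (Base)", 7.86, 8.06, True),
    ("LoRA-SIBO", "Overall", "GPT-J (6B)", 42.6, 37.5, False),
    ("LoRA-SIBO", "Overall", "LLaMA (7B)", 47.5, 46.9, False),
]


def relative_gain(newer, lora, lower_is_better):
    """Relative improvement of the newer method over LoRA, in percent."""
    diff = (lora - newer) if lower_is_better else (newer - lora)
    return 100.0 * diff / lora


gains = [relative_gain(n, l, lower) for _, _, _, n, l, lower in results]

for (method, metric, setting, _, _, _), gain in zip(results, gains):
    print(f"{method:<10} {metric:<8} [{setting}] {gain:+.2f}%")
```

The margins vary widely under this normalization: some are well under two percent while others exceed ten percent, so a "beats LoRA" label alone says little about the size of the gap.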

Newer alternatives

Recent methods in the same sub-problem, not yet superseded in the knowledge base, include Balanced LoRA, FedSmoothLoRA, FuRA, LoRA-Over, and Hybrid-LoRA.