DoRA
DoRA: Weight-Decomposed Low-Rank Adaptation
Parameter-efficient fine-tuning (LoRA family) · first seen Feb 14, 2024
heavily superseded — a standard baseline that newer methods routinely beat
16 papers critique it · 66 beat it on benchmarks
What papers say
Verbatim critique sentences, each from a paper that cites DoRA as a baseline.
“DoRA decomposes the model weights into their directional and magnitude components and fine-tunes both, but only the former remains low-rank.”
— LoFT: Low-Rank Adaptation That Behaves Like Full Fine-Tuning
“This incurs (d_in^2) memory for the identity matrix alone: 32 MB at d_in=4096, 128 MB at d_in=8192 in bf16. Including the dense BA product and composed-weight copy, a single module allocates 3–4 dense [d_out, d_in] temporaries: 512 MB at d_in=8192.”
— Scaling DoRA: High-Rank Adaptation via Factored Norms and Fused Kernels
“DoRA implicitly assumes that the direction of a matrix can be decomposed into per-column units, an assumption that lacks a clear theoretical grounding in matrix analysis.”
— MAP: Revisiting Weight Decomposition for Low-Rank Adaptation
“DoRA relies on strict normalization, which makes it sensitive to optimization instabilities: when the adapted weight norm approaches zero, gradients can explode, destabilizing training.”
— DoRAN: Stabilizing Weight-Decomposed Low-Rank Adaptation via Noise Injection and Auxiliary Networks
“Nonetheless, DoRA introduces additional parameters and over-expressive architecture compared to LoRA, which can exacerbate overfitting issues when adapting to small downstream datasets (See tab:gap).”
— BiDoRA: Bi-level Optimization-Based Weight-Decomposed Low-Rank Adaptation
“These approaches primarily operate in weight space and can improve optimization and generalization under moderate ranks. A common assumption underlying these methods is that task-relevant adaptation directions can be inferred directly from the pretrained weight geometry, without explicit reference to data-induced activation patterns.”
— When Is Rank-1 Enough? Geometry-Guided Initialization for Parameter-Efficient Fine-Tuning
“decomposes pretrained weights into magnitude and direction components, utilizing LoRA for directional updates, reducing trainable parameters and enhancing fine-tuning performance, though its complexity and dependence on data quality may limit its effectiveness.”
— SSMLoRA: Enhancing Low-Rank Adaptation with State Space Model
“Although DoRA improves LoRA's learning capacity, its parameter count scales with the model's dimensionality since the magnitude component in DoRA is an n-dimensional trainable vector, where n represents the number of columns of the weight matrix.”
— EDoRA: Efficient Weight-Decomposed Low-Rank Adaptation via Singular Value Decomposition
“This directly addresses a symptom of the scale ambiguity we identified, but it provides a heuristic fix without altering the core $BA^$ parameterization that creates the ambiguity.”
— OrthoGeoLoRA: Geometric Parameter-Efficient Fine-Tuning for Structured Social Science Concept Retrieval on the Web
“DoRA liu2024dora and LoRA+ hayou2024lora, address limitations in LoRA's training dynamics”
— The Quest for Winning Tickets in Low-Rank Adapters
“Learning parameter-based adaptation methods may struggle to generalize to out-of-distribution tasks, particularly when the injection of additional parameters is suboptimally placed, potentially leading to degraded performance”
— Surgical AI Copilot: Energy-Based Fourier Gradient Low-Rank Adaptation for Surgical LLM Agent Reasoning and Planning
“SoRA and DoRA both incur additional training overhead in the form of architectural modifications, importance calculations, additional regularization terms, or bespoke optimization strategies.”
— Post-Optimization Adaptive Rank Allocation for LoRA
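Several of the critiques above turn on the same mechanics: a per-column magnitude vector, a normalised low-rank direction, and dense temporaries of shape [d_out, d_in]. The following is a minimal pure-Python sketch of that recomposition, not any paper's implementation. The function names are hypothetical, the toy shapes are chosen for illustration, and the memory check assumes d_out = d_in and reads the quoted "MB" as 2^20 bytes.

```python
import math

def matmul(x, y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*y)] for row in x]

def column_norms(w):
    """L2 norm of each column of w (one value per input dimension)."""
    return [math.sqrt(sum(v * v for v in col)) for col in zip(*w)]

def dora_weight(w0, b, a, m):
    """Recompose W' = m * (W0 + B A) / ||W0 + B A||_c, column by column."""
    delta = matmul(b, a)  # low-rank directional update, shape [d_out, d_in]
    v = [[w + d for w, d in zip(w_row, d_row)] for w_row, d_row in zip(w0, delta)]
    norms = column_norms(v)
    # Dividing by a column norm near zero is where the quoted instability arises.
    return [[m[j] * x / norms[j] for j, x in enumerate(row)] for row in v]

def trainable_params(d_out, d_in, r):
    """Return (LoRA, DoRA) trainable counts for one module.

    DoRA adds one magnitude entry per column, i.e. d_in extra parameters.
    """
    lora = r * (d_out + d_in)
    return lora, lora + d_in

def bf16_mib(rows, cols):
    """Size in MiB of a dense rows x cols matrix at 2 bytes per element."""
    return rows * cols * 2 / 2**20

# Toy module (assumed shapes): d_out=2, d_in=3, rank r=1.
w0 = [[3.0, 0.0, 1.0], [4.0, 2.0, 1.0]]
b_zero = [[0.0], [0.0]]      # B starts at zero, as in LoRA
a = [[0.5, -0.5, 1.0]]
m = column_norms(w0)         # magnitude initialised to the column norms of W0

w_init = dora_weight(w0, b_zero, a, m)            # reproduces W0 at initialisation
w_moved = dora_weight(w0, [[1.0], [-1.0]], a, m)  # direction changes, column norms stay at m

print(trainable_params(4096, 4096, 16))
print(bf16_mib(4096, 4096), bf16_mib(8192, 8192), 4 * bf16_mib(8192, 8192))
```

Under these assumptions the last line reproduces the quoted figures: 32 and 128 for a single dense square matrix, and 512 when four such temporaries are live at d_in=8192. The `w_moved` case shows why the magnitude vector is a separate degree of freedom: changing B alters each column's direction while its norm stays pinned to `m`.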
Beaten on benchmarks
Head-to-head results where a newer method reports beating DoRA. Values are copied from the source paper's tables — verify against the cited paper.
- LoFT: Low-Rank Adaptation That Behaves Like Full Fine-Tuning
LoFT beats DoRA · average accuracy [LLaMA-7B, r=16]
76.08 vs 71.11
- LoFT: Low-Rank Adaptation That Behaves Like Full Fine-Tuning
LoFT beats DoRA · average accuracy [LLaMA2-7B, r=16]
80.46 vs 79.71
- LoFT: Low-Rank Adaptation That Behaves Like Full Fine-Tuning
LoFT beats DoRA · average accuracy [LLaMA3-8B, r=16]
85.63 vs 84.96
- LoFT: Low-Rank Adaptation That Behaves Like Full Fine-Tuning
LoFT beats DoRA · average accuracy [ViT-Base, r=16]
76.12 vs 74.74
- Maintaining Structural Integrity in Parameter Spaces for Parameter Efficient Fine-tuning
FLoRA beats DoRA · Avg [12.77M-param budget]
52.5 vs 45.8
- Maintaining Structural Integrity in Parameter Spaces for Parameter Efficient Fine-tuning
FLoRA beats DoRA · Avg [25.65M-param budget]
53.7 vs 46.9
- Maintaining Structural Integrity in Parameter Spaces for Parameter Efficient Fine-tuning
FLoRA beats DoRA · Avg [40.49M-param budget]
54.7 vs 45.0
- Maintaining Structural Integrity in Parameter Spaces for Parameter Efficient Fine-tuning
FLoRA beats DoRA · All [0.33M-param budget]
89.21 vs 88.31
- Maintaining Structural Integrity in Parameter Spaces for Parameter Efficient Fine-tuning
FLoRA beats DoRA · All [1.33M-param budget]
89.80 vs 88.49
- Maintaining Structural Integrity in Parameter Spaces for Parameter Efficient Fine-tuning
FLoRA beats DoRA · Avg [4.63%-param budget]
67.8 vs 67.6
- IGU-LoRA: Adaptive Rank Allocation via Integrated Gradients and Uncertainty-Aware Scoring
IGU-LoRA beats DoRA · Avg [RoBERTa-large, GLUE benchmark]
89.42 vs 88.75
- RoSA: Enhancing Parameter-Efficient Fine-Tuning via RoPE-aware Selective Adaptation in Large Language Models
RoSA beats DoRA · micro-avg(%) [Qwen2.5-7B]
85.9 vs 84.9
Newer alternatives
Recent methods in the same sub-problem, not yet superseded in the knowledge base.
- May 29, 2026
- May 28, 2026
- May 19, 2026
- May 15, 2026
- May 12, 2026
- May 11, 2026
- May 11, 2026
- May 8, 2026
- May 5, 2026
- May 5, 2026
- May 5, 2026
- RDP LoRA: Geometry-Driven Identification for Parameter-Efficient Adaptation in Large Language Models · Apr 21, 2026