CVIVJun 25, 2025

WaRA: Wavelet Low Rank Adaptation

arXiv:2506.24092v1h-index: 12Has Code
Originality Incremental advance
AI Analysis

This work addresses the problem of inefficient fine-tuning in machine learning, particularly for vision applications, by introducing a novel adaptation technique that improves flexibility and sparsity, though it is incremental as it builds on established PEFT methods.

The paper tackles the limitation of existing parameter-efficient fine-tuning methods like LoRA, which use global low-rank factorizations and miss local or multi-scale structures, by proposing WaRA, a method that uses wavelet transforms for multi-resolution decomposition, resulting in superior performance on vision tasks with enhanced image quality and reduced computational complexity.

Parameter-efficient fine-tuning (PEFT) has gained widespread adoption across various applications. Among PEFT techniques, Low-Rank Adaptation (LoRA) and its extensions have emerged as particularly effective, allowing efficient model adaptation while significantly reducing computational overhead. However, existing approaches typically rely on global low-rank factorizations, which overlook local or multi-scale structure, failing to capture complex patterns in the weight updates. To address this, we propose WaRA, a novel PEFT method that leverages wavelet transforms to decompose the weight update matrix into a multi-resolution representation. By performing low-rank factorization in the wavelet domain and reconstructing updates through an inverse transform, WaRA obtains compressed adaptation parameters that harness multi-resolution analysis, enabling it to capture both coarse and fine-grained features while providing greater flexibility and sparser representations than standard LoRA. Through comprehensive experiments and analysis, we demonstrate that WaRA performs superior on diverse vision tasks, including image generation, classification, and semantic segmentation, significantly enhancing generated image quality while reducing computational complexity. Although WaRA was primarily designed for vision tasks, we further showcase its effectiveness in language tasks, highlighting its broader applicability and generalizability. The code is publicly available at \href{GitHub}{https://github.com/moeinheidari7829/WaRA}.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes