LG | Feb 23, 2024

Second-Order Fine-Tuning without Pain for LLMs: A Hessian Informed Zeroth-Order Optimizer

arXiv:2402.15173v4 | 52 citations | h-index: 72 | ICLR
Originality: Incremental advance
AI Analysis

This work addresses memory constraints for researchers and practitioners fine-tuning LLMs, offering a more efficient optimizer, though it is incremental as it builds on existing zeroth-order methods.

The paper tackles the problem of high GPU memory usage in fine-tuning large language models (LLMs) by proposing HiZOO, a zeroth-order optimizer that uses diagonal Hessian information to handle parameter curvature heterogeneity, resulting in improved convergence, reduced training steps, and enhanced accuracy across models with 350M to 66B parameters.

Fine-tuning large language models (LLMs) with classic first-order optimizers requires prohibitive GPU memory because of backpropagation. Recent works have turned to zeroth-order optimizers for fine-tuning, which save substantial memory by using only two forward passes. However, these optimizers are hampered by the heterogeneity of parameter curvatures across different dimensions. In this work, we propose HiZOO, a diagonal-Hessian-informed zeroth-order optimizer, which is the first work to leverage the diagonal Hessian to enhance zeroth-order optimizers for fine-tuning LLMs. Moreover, HiZOO avoids the expensive memory cost and adds only one forward pass per step. Extensive experiments on various models (350M to 66B parameters) indicate that HiZOO improves model convergence, significantly reducing training steps and effectively enhancing model accuracy. We also visualize the optimization trajectories of HiZOO on test functions, illustrating its effectiveness in handling heterogeneous curvatures. Lastly, we provide theoretical proofs of convergence for HiZOO. Code is publicly available at https://anonymous.4open.science/r/HiZOO27F8.
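
To make the forward-pass counts in the abstract concrete, here is a minimal sketch of the two kinds of step on a toy problem. It is an illustration under stated assumptions, not the authors' implementation: the quadratic `loss`, the names `zo_step` and `hessian_informed_zo_step`, the hyperparameters, and the specific diagonal-Hessian estimator (a Stein-type identity smoothed with an exponential moving average) are all hypothetical choices made for this example. The plain step uses two loss evaluations; the curvature-scaled step adds a third, `loss(theta)`, to form a second-order finite difference.

```python
import math
import random

# Hypothetical toy loss: a quadratic whose curvature differs 100x between
# its two coordinates. It stands in for an LLM loss and is not from the paper.
CURVATURE = [100.0, 1.0]


def loss(theta):
    return 0.5 * sum(c * t * t for c, t in zip(CURVATURE, theta))


def zo_step(theta, lr, eps, rng):
    """Plain zeroth-order (SPSA-style) step: two forward passes, no backprop."""
    z = [rng.gauss(0.0, 1.0) for _ in theta]
    plus = loss([t + eps * zi for t, zi in zip(theta, z)])
    minus = loss([t - eps * zi for t, zi in zip(theta, z)])
    slope = (plus - minus) / (2.0 * eps)  # directional derivative along z
    return [t - lr * slope * zi for t, zi in zip(theta, z)]


def hessian_informed_zo_step(theta, inv_sigma, lr, eps, alpha, rng):
    """Curvature-scaled zeroth-order step: three forward passes.

    inv_sigma is a running estimate of the diagonal Hessian; each coordinate
    of the random direction is divided by sqrt(inv_sigma) before use.
    """
    z = [rng.gauss(0.0, 1.0) for _ in theta]
    d = [zi / math.sqrt(s) for zi, s in zip(z, inv_sigma)]  # scaled direction
    plus = loss([t + eps * di for t, di in zip(theta, d)])
    minus = loss([t - eps * di for t, di in zip(theta, d)])
    base = loss(theta)  # the one extra forward pass
    slope = (plus - minus) / (2.0 * eps)
    curv = (plus + minus - 2.0 * base) / (eps * eps)  # approximates d^T H d
    # Assumed estimator: 0.5 * curv * (z_i^2 - 1) * inv_sigma_i has
    # expectation H_ii when the loss is quadratic (Stein-type identity).
    h_diag = [0.5 * curv * (zi * zi - 1.0) * s for zi, s in zip(z, inv_sigma)]
    # Exponential moving average; abs() keeps the running estimate positive.
    new_inv_sigma = [
        (1.0 - alpha) * s + alpha * abs(h) for s, h in zip(inv_sigma, h_diag)
    ]
    new_theta = [t - lr * slope * di for t, di in zip(theta, d)]
    return new_theta, new_inv_sigma


def run_demo(steps=2000, seed=0):
    """Run both optimizers from the same start and return their final losses."""
    rng = random.Random(seed)
    theta = [1.0, 1.0]
    for _ in range(steps):
        theta = zo_step(theta, lr=1e-3, eps=1e-3, rng=rng)
    plain = loss(theta)

    rng = random.Random(seed)
    theta, inv_sigma = [1.0, 1.0], [1.0, 1.0]
    for _ in range(steps):
        theta, inv_sigma = hessian_informed_zo_step(
            theta, inv_sigma, lr=1e-3, eps=1e-3, alpha=0.01, rng=rng
        )
    return plain, loss(theta), inv_sigma
```

Calling `run_demo()` returns the final loss of each variant along with the learned `inv_sigma`. In this sketch the update direction is divided by the running curvature estimate, so steep and flat coordinates take steps of more comparable relative size; the learning rates here are deliberately conservative and are not tuned to reproduce any result from the paper.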

