LG, May 29, 2025

Scalable Complexity Control Facilitates Reasoning Ability of LLMs

arXiv:2505.23013v1, 3 citations, h-index: 15
Originality: Incremental advance
AI Analysis

The paper addresses the challenge of reliably enhancing generalizability in LLMs, though it appears incremental because it builds on existing complexity control methods.

The paper aims to improve the reasoning ability of large language models. It shows that controlling model complexity, by adjusting the initialization rate and the weight decay coefficient, improves scaling laws across different model and data sizes. A comparison of 2.4B models pretrained on 1T tokens with different complexity hyperparameters shows corresponding gains in benchmark performance.

The reasoning ability of large language models (LLMs) has been rapidly advancing in recent years, attracting interest in more fundamental approaches that can reliably enhance their generalizability. This work demonstrates that model complexity control, conveniently implementable by adjusting the initialization rate and weight decay coefficient, improves the scaling law of LLMs consistently over varying model sizes and data sizes. This gain is further illustrated by comparing the benchmark performance of 2.4B models pretrained on 1T tokens with different complexity hyperparameters. Instead of fixing the initialization std, we found that a constant initialization rate (the exponent of std) enables the scaling law to descend faster in both model and data sizes. These results indicate that complexity control is a promising direction for the continual advancement of LLMs.
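The two knobs named in the abstract can be illustrated with a minimal sketch. It assumes that "the exponent of std" means the initial weight std follows std = fan_in ** (-gamma), where gamma is the initialization rate; that formula, the function names, and every numeric value below are illustrative assumptions rather than details taken from the paper. The weight decay step uses the generic decoupled form on plain SGD, not the authors' training setup.

```python
import math
import random


def init_std(fan_in: int, gamma: float) -> float:
    """Initial weight std under the ASSUMED rule std = fan_in ** (-gamma)."""
    return fan_in ** (-gamma)


def init_weights(fan_in: int, fan_out: int, gamma: float, seed: int = 0):
    """Draw a fan_out x fan_in weight matrix from N(0, std^2)."""
    rng = random.Random(seed)
    std = init_std(fan_in, gamma)
    return [[rng.gauss(0.0, std) for _ in range(fan_in)] for _ in range(fan_out)]


def decayed_sgd_step(w: float, grad: float, lr: float, weight_decay: float) -> float:
    """One SGD step with decoupled weight decay (generic form, not the paper's optimizer)."""
    return w - lr * grad - lr * weight_decay * w


if __name__ == "__main__":
    fixed_std = 0.02  # hypothetical fixed-std baseline
    gamma = 1.0       # hypothetical constant initialization rate
    for width in (256, 1024, 4096):
        # With a constant gamma the std shrinks as the layer widens;
        # with a fixed std it stays the same at every width.
        print(width, fixed_std, init_std(width, gamma))
```

In this sketch a larger gamma gives smaller initial weights and a larger weight decay coefficient pulls weights toward zero more strongly during training, so both act as ways to keep the weights small. Holding gamma constant, rather than holding the std constant, ties the initialization scale to the layer width as the model grows.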

