LG · May 27

Law of Neural Interaction: Depth-Width Shape, Interaction Efficiency, and Generalization

arXiv:2605.27989 · 50.4 · h-index: 4
Predicted impact: top 45% in LG · last 90 days · Originality: Incremental advance
AI Analysis

Provides insights into model shape initialization and generalization mechanisms for LLM practitioners.

The paper introduces the concept of neural interaction, extending superposition to gradient space, and finds that under a fixed budget, good generalization correlates with efficient neural interactions. Adjusting the depth-width ratio ($R_{D/W}$) places models in an efficient interaction interval, and models near this interval perform better on MMLU-Pro.

Guided by scaling laws, modern large language models (LLMs) demand ever more resources, yet it remains questionable whether these models utilize resources effectively under a fixed budget. Previous research has shown superposition to be a key contributor to loss. By leveraging the Neural Feature Ansatz, we extend superposition from parameter space to gradient space and define it as neural interaction. We find that under a fixed budget, good generalization is usually accompanied by efficient neural interactions, and the model can be placed in an efficient interaction interval by adjusting its depth-width ratio ($R_{D/W}$). In addition, as the budget scales up, the efficient interaction interval of the model remains relatively stable. By comparing existing small-scale dense LLMs, we observe that models operating near this interval tend to perform better on the MMLU-Pro benchmark. Our findings reveal that $R_{D/W}$ influences resource utilization efficiency and thereby affects generalization, providing insights into model shape initialization and the understanding of model generalization mechanisms. Code for Neural Interaction Law is available at: https://anonymous.4open.science/r/Neural_Interaction_Law-D788
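
To make the "fixed budget, varying shape" idea concrete, here is a minimal sketch that lists depth/width combinations of roughly equal size and reports their depth-width ratio. It rests on two assumptions that are not taken from the paper: $R_{D/W}$ is treated as the number of layers divided by the hidden width, and the budget is measured with the common rule-of-thumb estimate of about 12 · layers · width² non-embedding parameters for a dense transformer. The function names are hypothetical, and the paper's own definitions may differ.

```python
# Illustrative sketch only. Assumptions (not from the paper):
#   * R_{D/W} is taken to be n_layers / d_model
#   * non-embedding parameters are approximated as 12 * n_layers * d_model**2


def approx_params(n_layers: int, d_model: int) -> int:
    """Rule-of-thumb non-embedding parameter count of a dense transformer."""
    return 12 * n_layers * d_model ** 2


def depth_width_ratio(n_layers: int, d_model: int) -> float:
    """Assumed definition of the depth-width ratio: layers per unit of width."""
    return n_layers / d_model


def shapes_under_budget(budget: int, widths, tolerance: float = 0.1):
    """For each candidate width, pick the depth closest to the budget.

    Returns (n_layers, d_model, params, ratio) tuples whose parameter count
    lies within `tolerance` (relative) of the budget.
    """
    shapes = []
    for d_model in widths:
        n_layers = round(budget / (12 * d_model ** 2))
        if n_layers < 1:
            continue  # this width alone already exceeds the budget
        params = approx_params(n_layers, d_model)
        if abs(params - budget) / budget <= tolerance:
            ratio = depth_width_ratio(n_layers, d_model)
            shapes.append((n_layers, d_model, params, ratio))
    return shapes


if __name__ == "__main__":
    for n_layers, d_model, params, ratio in shapes_under_budget(
        85_000_000, [512, 768, 1024]
    ):
        print(f"L={n_layers:3d}  d={d_model:5d}  params={params:,}  R={ratio:.4f}")
```

Each listed shape costs roughly the same number of parameters but sits at a different ratio; these are the kind of candidates one would compare when looking for an efficient interaction interval.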

