CL · AI · Apr 22

LayerTracer: A Joint Task-Particle and Vulnerable-Layer Analysis framework for Arbitrary Large Language Model Architectures

Yuhang Wu, Qinyuan Liu, Qiuyang Zhao, Qingwei Chong

arXiv:2604.20556 · 57.9

Predicted impact: top 98% in CL · last 90 days · Originality: Incremental advance

AI Analysis

The paper addresses core challenges in hybrid LLM architecture design and optimization that matter to researchers and developers. It is an incremental advance because it builds on existing analysis methods.

The paper tackles two open questions in diverse LLM architectures: how hierarchical representations evolve across layers, and where robustness bottlenecks lie. It proposes LayerTracer, a framework for joint task-particle localization and vulnerable-layer analysis, and reports that task particles appear in deep layers and that larger models show stronger hierarchical robustness.

Large Language Models (LLMs) now span a diverse architectural landscape, including the traditional Transformer, GateDeltaNet, and Mamba. However, how hierarchical representations evolve, where task knowledge forms, and which mechanisms create robustness bottlenecks remain unclear across these architectures, posing core challenges for hybrid architecture design and model optimization. This paper proposes LayerTracer, an architecture-agnostic, end-to-end analysis framework compatible with any LLM architecture. By extracting hidden states layer by layer and mapping them to vocabulary probability distributions, it jointly localizes task particles and quantifies layer vulnerability. We define the task particle as the key layer where the target token probability first rises significantly, representing the starting point of the model's task execution. The vulnerable layer is defined as the layer with the maximum Jensen-Shannon (JS) divergence between output distributions before and after mask perturbation, reflecting its sensitivity to disturbances. Experiments on models of different parameter scales show that task particles mainly appear in the deep layers of the model regardless of parameter size, while larger models exhibit stronger hierarchical robustness. LayerTracer provides a basis for choosing layer division, module ratios, and gating switches in hybrid architectures, which can guide model optimization. It locates task-effective layers and stability bottlenecks, offering general support for LLM structure design and interpretability research.
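The two definitions in the abstract can be illustrated with a small sketch. This is not the authors' implementation. It assumes that per-layer logits over the vocabulary have already been obtained by projecting each layer's hidden state through the output head, and that one final output distribution is available for each masked layer. The function names, the `jump` threshold standing in for "rises significantly", and the toy numbers are all hypothetical.

```python
import math

def softmax(logits):
    """Convert a list of logits into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def js_divergence(p, q):
    """Jensen-Shannon divergence (natural log) between distributions p and q."""
    def kl(a, b):
        return sum(ai * math.log(ai / bi) for ai, bi in zip(a, b) if ai > 0)
    mix = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl(p, mix) + 0.5 * kl(q, mix)

def find_task_particle(target_probs, jump=0.1):
    """Return the first layer whose target-token probability rises by at
    least `jump` over the previous layer, or None if no layer qualifies.
    The threshold is an assumption; the abstract only says "significantly"."""
    for layer in range(1, len(target_probs)):
        if target_probs[layer] - target_probs[layer - 1] >= jump:
            return layer
    return None

def find_vulnerable_layer(clean_output, masked_outputs):
    """Return (layer, scores), where masked_outputs[i] is the output
    distribution observed when layer i is masked (assumed setup) and
    scores[i] is its JS divergence from the unperturbed output."""
    scores = [js_divergence(clean_output, q) for q in masked_outputs]
    return max(range(len(scores)), key=scores.__getitem__), scores

# Toy data: 4 layers, vocabulary of 3 tokens, target token at index 2.
layer_logits = [[2.0, 1.0, 0.0], [2.0, 1.0, 0.5], [1.0, 1.0, 3.0], [0.0, 0.0, 5.0]]
target_probs = [softmax(logits)[2] for logits in layer_logits]
task_particle = find_task_particle(target_probs)

clean_output = [0.1, 0.1, 0.8]
masked_outputs = [[0.1, 0.1, 0.8], [0.2, 0.2, 0.6], [0.6, 0.3, 0.1]]
vulnerable_layer, js_scores = find_vulnerable_layer(clean_output, masked_outputs)

print("task particle layer:", task_particle)
print("vulnerable layer:", vulnerable_layer)
```

With the natural logarithm, JS divergence is bounded between 0 and ln 2, so scores from different layers and models fall on a common scale. How the paper applies its mask perturbation and which threshold it uses for the probability rise are not stated in the abstract, so both are placeholders here.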

