Cognitive Activation and Chaotic Dynamics in Large Language Models: A Quasi-Lyapunov Analysis of Reasoning Mechanisms
This research offers a chaos-theoretic framework for interpreting LLM reasoning, which may help balance creativity and reliability in model design. The contribution is incremental, however, because it builds on established dynamical systems concepts.
The paper addresses the problem of understanding the reasoning mechanisms of Large Language Models. It proposes the Cognitive Activation theory, which frames reasoning as a chaotic process in a dynamical system, and introduces the Quasi-Lyapunov Exponent to quantify the chaotic characteristics of the model. It finds that information accumulation follows a nonlinear exponential law and that MLP layers contribute more to the final output than attention mechanisms do.
The human-like reasoning capabilities exhibited by Large Language Models (LLMs) challenge the view, held in traditional neural network theory, that a system with fixed parameters has limited flexibility. This paper proposes the "Cognitive Activation" theory, which explains LLMs' reasoning mechanisms from the perspective of dynamical systems: the model's reasoning ability stems from a chaotic process of dynamic information extraction in the parameter space. By introducing the Quasi-Lyapunov Exponent (QLE), we quantitatively analyze the chaotic characteristics of the model at different layers. Experiments show that the model's information accumulation follows a nonlinear exponential law, and that the Multilayer Perceptron (MLP) accounts for a larger share of the final output than the attention mechanism. Further experiments indicate that minor perturbations of the initial values have a substantial impact on the model's reasoning ability, confirming the theoretical analysis that large language models are chaotic systems. This research provides a chaos theory framework for the interpretability of LLMs' reasoning and reveals potential pathways for balancing creativity and reliability in model design.
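To make the idea of a layer-wise Lyapunov-style measurement concrete, the following is a minimal sketch only. The abstract does not give the paper's definition of the QLE, so this is not the authors' method. It assumes a hypothetical toy stack of random `tanh` layers in place of a real LLM, perturbs the initial state by a small `eps`, and records the log growth of the separation between the clean and perturbed states at each layer. All function names and parameters (`make_layer`, `run_toy_stack`, `gain`, `eps`) are illustrative assumptions.

```python
import math
import random


def make_layer(dim, gain, rng):
    """Hypothetical toy layer: dense weights drawn from N(0, gain^2 / dim)."""
    scale = gain / math.sqrt(dim)
    return [[rng.gauss(0.0, scale) for _ in range(dim)] for _ in range(dim)]


def apply_layer(weights, state):
    """Apply tanh(W @ state) using plain Python lists."""
    return [math.tanh(sum(w * s for w, s in zip(row, state))) for row in weights]


def distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def layerwise_divergence_rates(layers, state, eps, rng):
    """Return log(d_l / d_{l-1}) per layer, where d_l is the distance between
    the clean and the perturbed state after layer l.

    Assumes the two trajectories never coincide exactly (d_l > 0).
    """
    # Perturb the initial state along a random direction of length eps.
    direction = [rng.gauss(0.0, 1.0) for _ in state]
    norm = math.sqrt(sum(d * d for d in direction))
    perturbed = [s + eps * d / norm for s, d in zip(state, direction)]

    prev = distance(state, perturbed)
    rates = []
    for weights in layers:
        state = apply_layer(weights, state)
        perturbed = apply_layer(weights, perturbed)
        cur = distance(state, perturbed)
        rates.append(math.log(cur / prev))
        prev = cur
    return rates


def run_toy_stack(gain, dim=32, depth=6, eps=1e-6, seed=0):
    """Build a seeded toy stack and return (mean rate, per-layer rates)."""
    rng = random.Random(seed)
    layers = [make_layer(dim, gain, rng) for _ in range(depth)]
    state = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
    rates = layerwise_divergence_rates(layers, state, eps, rng)
    return sum(rates) / len(rates), rates


if __name__ == "__main__":
    for gain in (0.3, 3.0):
        mean, rates = run_toy_stack(gain)
        print(f"gain={gain}: mean log-divergence per layer = {mean:.3f}")
```

In this toy setting, a negative mean rate means the perturbation shrinks from layer to layer, while a positive one means it grows, which is the qualitative signature the abstract associates with chaotic behavior. Nothing here reproduces the paper's experiments or says anything about a real model.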