LG AI CL Jun 9, 2025

Graph-of-Causal Evolution: Challenging Chain-of-Model for Reasoning

arXiv:2506.07501v1

Originality: Incremental advance

AI Analysis

This work addresses a bottleneck in transformer-based reasoning that matters to AI researchers, though it appears incremental because it builds on the existing chain-of-model framework.

The paper tackles the loss of long-range dependencies in chain-of-model reasoning. It proposes Graph-of-Causal Evolution (GoCE), which maps token representations to a sparse causal adjacency matrix and propagates the resulting causal constraints through the network using causal-masked attention and causal-MoE. Compared with baseline LLMs on datasets including CLUTRR, CLADDER, EX-FEVER, and CausalQA, GoCE is reported to capture long-range causal dependencies better and to show improved self-evolution ability.

In the chain-of-model (CoM), each subchain relies only on the information of the previous subchain, so long-range dependencies may be lost because the causal mask blocks the flow of global context between multi-level subchains. To address this problem, this work proposes a graph of causal evolution (GoCE). Its core principle is to map the implicit token representation into a differentiable, sparse causal adjacency matrix, and then to propagate causal constraints through each layer of computation using causal-masked attention and causal-MoE. Combining an intervention consistency loss test with a self-evolution gate achieves a dynamic balance between causal structure learning and adaptive updating of the transformer architecture. The researcher built experimental environments in sandboxes built with Claude Sonnet 4, o4-mini-high, and DeepSeek R1, each using the transformer variant architecture introduced in GoCE. GoCE is evaluated on publicly available datasets, including CLUTRR, CLADDER, EX-FEVER, and CausalQA, and compared with baseline LLMs. The findings indicate that GoCE strengthens the transformer's ability to capture long-range causal dependencies and improves its ability to self-evolve. According to the abstract, it not only surpasses CoM in terms of design principles but also provides experience for future research on causal learning and continuous adaptive improvement.
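The core idea described above, deriving a sparse adjacency matrix from token representations and then restricting attention to its edges, can be illustrated with a small sketch. This is not the paper's implementation: the function names `causal_adjacency` and `masked_attention`, the sigmoid scoring rule, the hard threshold, and the restriction to earlier positions are all assumptions made for illustration. In particular, the hard threshold stands in for whatever differentiable sparsification GoCE actually uses, and the tokens serve directly as queries, keys, and values with no learned projections.

```python
import math


def dot(u, v):
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def causal_adjacency(tokens, threshold=0.5):
    """Hypothetical scoring rule (an assumption, not from the paper).

    An edge j -> i is kept when j precedes i and the sigmoid of the scaled
    dot product of the two token vectors exceeds `threshold`. Self-loops are
    always kept so every position can attend to at least itself.
    """
    n, d = len(tokens), len(tokens[0])
    adjacency = [[0] * n for _ in range(n)]
    for i in range(n):
        adjacency[i][i] = 1
        for j in range(i):
            score = 1.0 / (1.0 + math.exp(-dot(tokens[i], tokens[j]) / math.sqrt(d)))
            if score > threshold:
                adjacency[i][j] = 1
    return adjacency


def masked_attention(tokens, adjacency):
    """Softmax attention in which position i only sees j where adjacency[i][j] == 1."""
    n, d = len(tokens), len(tokens[0])
    outputs = []
    for i in range(n):
        allowed = [j for j in range(n) if adjacency[i][j]]
        logits = [dot(tokens[i], tokens[j]) / math.sqrt(d) for j in allowed]
        peak = max(logits)  # subtract the max for numerical stability
        weights = [math.exp(l - peak) for l in logits]
        total = sum(weights)
        outputs.append([
            sum(w / total * tokens[j][k] for w, j in zip(weights, allowed))
            for k in range(d)
        ])
    return outputs


# Toy input: positions 0 and 2 share a direction, position 1 is orthogonal.
toy_tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
toy_adjacency = causal_adjacency(toy_tokens)
toy_output = masked_attention(toy_tokens, toy_adjacency)
```

In this toy input, position 2 links back to position 0 while skipping the unrelated position 1, which is the kind of non-adjacent dependency a graph-structured mask can express.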
