AI · Jun 2

When to Re-Plan: Subgoal Persistence in Hierarchical Latent Reasoning

arXiv:2606.03741 · 36.2 · h-index: 4

Predicted impact: top 88% in AI · last 90 days

Originality: Incremental advance

AI Analysis

For researchers building hierarchical latent reasoning systems, this work identifies subgoal persistence as a critical design principle for compositional planning.

The paper studies the stability-adaptivity tradeoff in latent reasoning. On ARC and ConceptARC, moderate subgoal persistence periods (P in [3, 6]) outperform both very frequent re-planning (P=1) and very long horizons, with a minimum LM loss of 1.544 at P=3 versus 1.674 at P=1 and 1.640 for the baseline.

Long-horizon reasoning requires a system to commit to medium-horizon intent without becoming rigid: re-plan too often and computation never coheres into multi-step structure; commit too long and the plan goes stale. We study this stability-adaptivity tradeoff in the latent reasoning setting, where multi-step computation occurs inside hidden state rather than externalized token traces. We extend the Hierarchical Reasoning Model (HRM) with a feudal-style manager-worker interface: a slow high-level module periodically emits a normalized directional subgoal that persists for P low-level steps, biasing the worker's hidden-state updates and supplying an intrinsic cosine alignment loss. On ARC and ConceptARC, we find that subgoal persistence -- not subgoal injection alone -- is the central knob: moderate periods P in [3, 6] consistently outperform both very frequent re-planning (P=1) and very long horizons, with a clear minimum LM loss at P=3 (1.544 vs. 1.674 at P=1, 1.640 baseline; replicated over 5 seeds at mean 1.595, std 0.045). The intrinsic alignment weight lambda shows a complementary narrow optimum (lambda approximately 0.05). A controlled ablation at past-sweet-spot lambda isolates learned directional structure -- not architectural capacity or auxiliary loss alone -- as the source of interference when the alignment signal exceeds its optimum. Together these findings point to a design principle for compositional planning in latent reasoning systems: medium-horizon intent must be coherent across enough computational steps for compositional structure to form.
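
The persistence mechanism described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the function names (`rollout`, `toy_manager`, `toy_worker_step`, `total_loss`), the `bias_scale` parameter, the choice to measure alignment between the hidden-state change and the subgoal, and the additive combination of LM loss and alignment loss are all assumptions made for illustration. Only the period P, the normalized directional subgoal, the cosine alignment term, and the weight lambda come from the abstract.

```python
import math


def normalize(v):
    """Scale v to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def cosine(a, b):
    """Cosine similarity, defined as 0.0 if either vector is zero."""
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    if na == 0 or nb == 0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (na * nb)


def rollout(h0, manager, worker_step, num_steps, period, bias_scale=0.1):
    """Run num_steps worker updates, refreshing the subgoal every `period` steps.

    Returns the final hidden state, the mean alignment loss, and the list of
    steps at which the manager re-planned.
    """
    h = list(h0)
    subgoal = None
    align_terms = []
    refresh_steps = []
    for t in range(num_steps):
        if t % period == 0:
            # Manager emits a new normalized direction; it persists until
            # the next multiple of `period`.
            subgoal = normalize(manager(h))
            refresh_steps.append(t)
        h_next = worker_step(h, subgoal, bias_scale)
        # Assumed form of the intrinsic loss: 1 - cos(state change, subgoal).
        delta = [b - a for a, b in zip(h, h_next)]
        align_terms.append(1.0 - cosine(delta, subgoal))
        h = h_next
    align_loss = sum(align_terms) / len(align_terms)
    return h, align_loss, refresh_steps


def total_loss(lm_loss, align_loss, lam=0.05):
    """Assumed additive objective: LM loss plus lambda times alignment loss."""
    return lm_loss + lam * align_loss


def toy_manager(h):
    """Hypothetical manager: point from the current state toward a fixed target."""
    target = [1.0, 0.0, 0.0]
    return [t - x for t, x in zip(target, h)]


def toy_worker_step(h, subgoal, bias_scale):
    """Hypothetical worker: move the hidden state a small step along the subgoal."""
    return [x + bias_scale * g for x, g in zip(h, subgoal)]


if __name__ == "__main__":
    h, align_loss, refresh_steps = rollout(
        [0.0, 0.0, 0.0], toy_manager, toy_worker_step, num_steps=7, period=3
    )
    print("re-plan steps:", refresh_steps)
    print("alignment loss:", align_loss)
```

In this sketch, P=1 makes the manager re-plan at every worker step, while a larger period holds the same direction across several updates. The toy worker follows the subgoal exactly, so its alignment loss is near zero; a learned worker would trade that term against the LM loss through lambda.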
