AI · Jun 2

When to Re-Plan: Subgoal Persistence in Hierarchical Latent Reasoning

arXiv:2606.03741 · 36.2 · h-index: 4
Predicted impact: top 88% in AI · last 90 days · Originality: Incremental advance
AI Analysis

For researchers building hierarchical latent reasoning systems, this work identifies subgoal persistence as a critical design principle for compositional planning.

The paper studies the stability-adaptivity tradeoff in latent reasoning and finds that moderate subgoal persistence periods (P in [3, 6]) consistently outperform both very frequent re-planning and very long horizons on ARC and ConceptARC. LM loss reaches its minimum of 1.544 at P=3, compared with 1.674 at P=1 and 1.640 for the baseline.

Long-horizon reasoning requires a system to commit to medium-horizon intent without becoming rigid: re-plan too often and computation never coheres into multi-step structure; commit too long and the plan goes stale. We study this stability-adaptivity tradeoff in the latent reasoning setting, where multi-step computation occurs inside hidden state rather than externalized token traces. We extend the Hierarchical Reasoning Model (HRM) with a feudal-style manager-worker interface: a slow high-level module periodically emits a normalized directional subgoal that persists for P low-level steps, biasing the worker's hidden-state updates and supplying an intrinsic cosine alignment loss. On ARC and ConceptARC, we find that subgoal persistence -- not subgoal injection alone -- is the central knob: moderate periods P in [3, 6] consistently outperform both very frequent re-planning (P=1) and very long horizons, with a clear minimum LM loss at P=3 (1.544 vs. 1.674 at P=1, 1.640 baseline; replicated over 5 seeds at mean 1.595, std 0.045). The intrinsic alignment weight lambda shows a complementary narrow optimum (lambda approximately 0.05). A controlled ablation at past-sweet-spot lambda isolates learned directional structure -- not architectural capacity or auxiliary loss alone -- as the source of interference when the alignment signal exceeds its optimum. Together these findings point to a design principle for compositional planning in latent reasoning systems: medium-horizon intent must be coherent across enough computational steps for compositional structure to form.
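
The manager-worker interface described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The function names (`rollout`, `toy_manager`, `toy_worker`), the additive form of the subgoal bias, the `bias_scale` parameter, and the `1 - cosine` form of the alignment term are all assumptions made for illustration. Only the persistence period P, the normalized directional subgoal, and the alignment weight lambda come from the abstract.

```python
import numpy as np

def normalize(v, eps=1e-8):
    """Scale v to (approximately) unit length; eps avoids division by zero."""
    return v / (np.linalg.norm(v) + eps)

def rollout(h0, manager_fn, worker_fn, num_steps, period, bias_scale=0.1):
    """Run the worker for num_steps, refreshing the subgoal every `period` steps.

    Assumed (hypothetical) interface: the subgoal is added to the worker's
    input, and alignment is 1 - cos(hidden-state change, subgoal).
    """
    h = h0
    goal = None
    align_terms = []
    refresh_steps = []
    for t in range(num_steps):
        if t % period == 0:
            # Manager emits a new unit-norm directional subgoal.
            goal = normalize(manager_fn(h))
            refresh_steps.append(t)
        # Between refreshes the same subgoal keeps biasing the worker update.
        h_next = worker_fn(h + bias_scale * goal)
        delta = h_next - h
        cos = float(np.dot(normalize(delta), goal))
        align_terms.append(1.0 - cos)
        h = h_next
    return h, float(np.mean(align_terms)), refresh_steps

def total_loss(lm_loss, align_loss, lam=0.05):
    """LM loss plus the weighted intrinsic alignment term."""
    return lm_loss + lam * align_loss

# Toy stand-ins for the learned modules (random linear maps, not trained).
rng = np.random.default_rng(0)
dim = 8
W_manager = rng.normal(size=(dim, dim)) / np.sqrt(dim)
W_worker = rng.normal(size=(dim, dim)) / np.sqrt(dim)

def toy_manager(h):
    return W_manager @ h

def toy_worker(x):
    return np.tanh(W_worker @ x)

h_final, align, refreshes = rollout(
    rng.normal(size=dim), toy_manager, toy_worker, num_steps=12, period=3
)
print(refreshes)  # [0, 3, 6, 9]
```

With `period=1` the subgoal is replaced at every step, which corresponds to the P=1 setting in the abstract. Larger values of `period` hold the same direction fixed for more worker updates, which is the persistence knob the paper varies.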

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
