LG · May 7

Temporal Attention for Adaptive Control of Euler-Lagrange Systems with Unobservable Memory

arXiv:2605.06877 · 5.9
Predicted impact: top 95% in LG · last 90 days
Originality: Incremental advance
AI Analysis

For robotics and adaptive control, this work tackles the challenge of non-Markovian friction dynamics, but the approach is incremental and limited to specific memory regimes.

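The paper addresses adaptive control of Euler-Lagrange systems whose friction depends on an unobservable memory state, and proposes a meta-control architecture in which a self-attention block generates the gains of a computed-torque controller. In the short and matched memory regimes, the single-layer attention meta-controller reduces tracking error by 12 and 19 percentage points, respectively, relative to a deeper Transformer baseline. In the long memory regime the advantage disappears, and four of ten training runs diverge or collapse to a payload-invariant policy, which motivates adjusting the head count during training rather than fixing it beforehand.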

Adaptive control of Euler-Lagrange systems is challenging when friction is governed by a finite-horizon internal state that is not directly observable from joint measurements. In this setting, the measured closed-loop state is no longer Markovian, and standard certainty-equivalence adaptive laws may lose their convergence guarantees. The paper proposes a meta-control architecture in which the gains of a computed-torque controller are generated by a self-attention block processing a short window of recent motion history. The number of attention heads is selected before policy training through a surrogate analysis of the autocovariance of the memory-state gradient along the temporal window. This surrogate is based on a temporal adaptation of an incremental rank-tracking framework previously developed by the authors. The selected head count is then fixed and used as an architectural hyperparameter in a reinforcement-learning stage, where the policy is trained under a shielded admissibility constraint. The approach is tested on a 2-DOF manipulator with nonlinear friction and variable payload. In the short and matched memory regimes, the single-layer attention-only meta-controller outperforms a deeper Transformer baseline, with tracking-error reductions of 12 and 19 percentage points, respectively. The reported effect sizes are large, with d approximately -1.1 and -2.1, and Mann-Whitney p < 0.05 in both cases. In the long memory regime, however, the advantage disappears. Four out of ten training runs show either divergence or payload-invariant policy collapse, revealing a weakness in the static Phase-1 head-count prescription. This motivates moving rank-tracking inside the reinforcement-learning loop, allowing attention heads to be pruned or grown at runtime instead of fixed before training.
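The gain-generation idea can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, weight shapes, random weights, softplus output map, and the use of a single attention head are all assumptions made for illustration. A window of recent [q, dq] samples is passed through one scaled dot-product attention step, the result is mapped to positive gains Kp and Kd, and those gains are plugged into the standard computed-torque law tau = M (ddq_des + Kd de + Kp e) + h, where h collects the Coriolis and gravity terms.

```python
import math
import random

def matvec(rows, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in rows]

def softmax(xs):
    m = max(xs)                      # subtract the max for numerical stability
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def attention_gains(window, Wq, Wk, Wv, Wout):
    """Single-head attention over a window of [q, dq] rows -> (Kp, Kd).

    Hypothetical shapes: Wq, Wk, Wv are (d_model x d_in), Wout is (2*n_dof x d_model).
    """
    query = matvec(Wq, window[-1])   # the latest sample queries the whole window
    keys = [matvec(Wk, x) for x in window]
    values = [matvec(Wv, x) for x in window]
    scale = math.sqrt(len(query))
    scores = [sum(a * b for a, b in zip(query, k)) / scale for k in keys]
    weights = softmax(scores)
    context = [sum(w * v[i] for w, v in zip(weights, values))
               for i in range(len(values[0]))]
    raw = matvec(Wout, context)
    gains = [math.log1p(math.exp(r)) for r in raw]  # softplus keeps gains positive
    n = len(gains) // 2
    return gains[:n], gains[n:]

def computed_torque(q, dq, q_des, dq_des, ddq_des, M, h, Kp, Kd):
    """Computed-torque law with diagonal gains: M (ddq_des + Kd*de + Kp*e) + h."""
    v = [a + kd * (vd - vv) + kp * (qd - qq)
         for a, kd, vd, vv, kp, qd, qq
         in zip(ddq_des, Kd, dq_des, dq, Kp, q_des, q)]
    return [t + hi for t, hi in zip(matvec(M, v), h)]

# Usage with made-up numbers for a 2-DOF arm (all values are illustrative).
random.seed(0)

def rand_matrix(rows, cols, scale=0.1):
    return [[scale * random.gauss(0, 1) for _ in range(cols)] for _ in range(rows)]

n_dof, d_model, T = 2, 4, 8
window = [[random.gauss(0, 1) for _ in range(2 * n_dof)] for _ in range(T)]
Wq, Wk, Wv = (rand_matrix(d_model, 2 * n_dof) for _ in range(3))
Wout = rand_matrix(2 * n_dof, d_model)

Kp, Kd = attention_gains(window, Wq, Wk, Wv, Wout)
q, dq = window[-1][:n_dof], window[-1][n_dof:]
M = [[2.0, 0.0], [0.0, 3.0]]         # placeholder inertia matrix
h = [0.5, 0.5]                        # placeholder Coriolis + gravity terms
tau = computed_torque(q, dq, [0.0, 0.0], [0.0, 0.0], [0.0, 0.0], M, h, Kp, Kd)
```

In the paper the weights would come from the reinforcement-learning stage and the number of heads from the Phase-1 surrogate analysis; here a single head with random weights stands in for both.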

