MemoAct: Atkinson-Shiffrin-Inspired Memory-Augmented Visuomotor Policy for Robotic Manipulation
This work addresses memory-dependent tasks in robotic manipulation. Its tiered memory design targets both precise task state tracking and robust long-horizon retention, though the contribution appears incremental because it builds on existing memory-augmented methods.
The paper proposes MemoAct, a hierarchical memory-based policy for robotic manipulation inspired by the Atkinson-Shiffrin memory model. The authors report that it outperforms existing baselines on task state tracking and long-horizon retention in both simulated and real-world scenarios.
Memory-augmented robotic policies are essential in handling memory-dependent tasks. However, existing approaches typically rely on simple observation window extensions, struggling to simultaneously achieve precise task state tracking and robust long-horizon retention. To overcome these challenges, inspired by the Atkinson-Shiffrin memory model, we propose MemoAct, a hierarchical memory-based policy that leverages distinct memory tiers to tackle specific bottlenecks. Specifically, lossless short-term memory ensures precise task state tracking, while compressed long-term memory enables robust long-horizon retention. To enrich the evaluation landscape, we construct MemoryRTBench based on RoboTwin 2.0, specifically tailored to assess policy capabilities in task state tracking and long-horizon retention. Extensive experiments across simulated and real-world scenarios demonstrate that MemoAct achieves superior performance compared to both existing Markovian baselines and history-aware policies. The project page is available at https://tlf-tlf.github.io/MemoActPage/.
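To make the two memory tiers concrete, the following Python sketch shows one way a lossless short-term window and a compressed long-term store could fit together. It is an illustration under stated assumptions, not MemoAct's implementation: the class name `HierarchicalMemory`, the parameters `short_capacity` and `chunk_size`, and the use of mean pooling as the compression step are all hypothetical, since the abstract does not say how compression is performed.

```python
from collections import deque
from typing import Deque, List

Vector = List[float]


def _mean(vectors: List[Vector]) -> Vector:
    """Element-wise mean of equally sized vectors (a stand-in compressor)."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


class HierarchicalMemory:
    """Toy two-tier memory (hypothetical; not the authors' design).

    - short_term: the most recent features, stored verbatim (lossless).
    - long_term: one mean-pooled summary per `chunk_size` evicted features
      (lossy compression, assumed here purely for illustration).
    """

    def __init__(self, short_capacity: int, chunk_size: int) -> None:
        if short_capacity < 1 or chunk_size < 1:
            raise ValueError("short_capacity and chunk_size must be >= 1")
        self.short_capacity = short_capacity
        self.chunk_size = chunk_size
        self.short_term: Deque[Vector] = deque()
        self.long_term: List[Vector] = []
        self._pending: List[Vector] = []  # evicted, not yet a full chunk

    def add(self, feature: Vector) -> None:
        """Store a new feature; overflow moves the oldest one toward long-term."""
        self.short_term.append(list(feature))
        if len(self.short_term) > self.short_capacity:
            self._pending.append(self.short_term.popleft())
            if len(self._pending) == self.chunk_size:
                self.long_term.append(_mean(self._pending))
                self._pending = []

    def context(self) -> List[Vector]:
        """Oldest-to-newest context: summaries first, then the exact window."""
        summaries = [list(v) for v in self.long_term]
        if self._pending:
            # Summarize a partial chunk so no evicted feature is ignored.
            summaries.append(_mean(self._pending))
        return summaries + [list(v) for v in self.short_term]


if __name__ == "__main__":
    memory = HierarchicalMemory(short_capacity=2, chunk_size=2)
    for value in [0.0, 2.0, 4.0, 6.0, 8.0]:
        memory.add([value])
    print(memory.context())
```

In this sketch the context grows by one vector per `chunk_size` evicted features instead of one per time step, which is the trade-off the abstract points to: exact recent state for tracking, and a cheaper lossy record for long-horizon retention. A learned compressor would replace `_mean` in any realistic policy.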