MAGNET: Towards Adaptive GUI Agents with Memory-Driven Knowledge Evolution
This work addresses the challenge of maintaining autonomous task execution in evolving software environments, a concern for both users and developers of mobile GUI agents, and represents an incremental advance.
The paper tackles the problem of mobile GUI agents failing due to frequent UI updates by introducing MAGNET, a memory-driven adaptive agent framework that links visual features to stable functional semantics and task intents, resulting in substantial improvements over baselines in AndroidWorld evaluations.
Mobile GUI agents powered by large foundation models enable autonomous task execution, but frequent app updates that alter UI appearance and reorganize workflows cause agents trained on historical data to fail. Despite these surface changes, functional semantics and task intents remain fundamentally stable. Building on this insight, we introduce MAGNET, a memory-driven adaptive agent framework with dual-level memory: a stationary memory that links diverse visual features to stable functional semantics for robust action grounding, and a procedural memory that captures stable task intents across varying workflows. We propose a dynamic memory evolution mechanism that continuously refines both memories by prioritizing frequently accessed knowledge. Evaluations on the online benchmark AndroidWorld show substantial improvements over baselines, while offline benchmarks confirm consistent gains under distribution shifts. These results validate that leveraging structures that remain stable across interface changes improves agent performance and generalization in evolving software environments.
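The abstract does not specify how the two memories are stored or how the evolution mechanism ranks entries, so the following is only a minimal sketch under stated assumptions, not MAGNET's implementation. It assumes each memory maps a stable key (a functional semantic such as "search", or a task intent) to the varying surface forms observed for it, and it approximates "prioritizing frequently accessed knowledge" with a simple hit counter and least-frequently-used eviction. The names `MemoryEntry`, `FrequencyMemory`, `write`, and `read` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class MemoryEntry:
    key: str            # stable key: a functional semantic or a task intent
    values: list[str]   # surface forms seen so far (visual cues or workflow steps)
    hits: int = 0       # how often this entry has been retrieved


class FrequencyMemory:
    """Hypothetical bounded memory that keeps frequently accessed entries."""

    def __init__(self, capacity: int) -> None:
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.entries: dict[str, MemoryEntry] = {}

    def write(self, key: str, value: str) -> None:
        """Attach a new surface form to a stable key, evicting if full."""
        entry = self.entries.get(key)
        if entry is None:
            if len(self.entries) >= self.capacity:
                self._evict()
            entry = MemoryEntry(key, [])
            self.entries[key] = entry
        if value not in entry.values:
            entry.values.append(value)

    def read(self, key: str) -> list[str]:
        """Return the known surface forms for a key and count the access."""
        entry = self.entries.get(key)
        if entry is None:
            return []
        entry.hits += 1
        return list(entry.values)

    def _evict(self) -> None:
        # Assumption: the least frequently read entry is dropped first.
        coldest = min(self.entries.values(), key=lambda e: e.hits)
        del self.entries[coldest.key]


# Two instances stand in for the dual-level memory described above.
stationary = FrequencyMemory(capacity=2)   # visual features -> functional semantics
procedural = FrequencyMemory(capacity=2)   # workflows -> task intents

stationary.write("search", "magnifier icon in toolbar")
stationary.write("search", "text field labelled Search")
stationary.write("share", "arrow icon")
stationary.read("search")                  # accessed, so it is retained
stationary.write("settings", "gear icon")  # memory is full: "share" is evicted
```

In this sketch a redesigned search button is simply another value under the same "search" key, which mirrors the abstract's point that appearance changes while function stays fixed. A real system would presumably match visual features by similarity rather than by exact string keys.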