AI · CL · Apr 9

SkillClaw: Let Skills Evolve Collectively with Agentic Evolver

arXiv:2604.08377 · 98.228 citations
Predicted impact: top 5% in AI · last 90 days · Originality: Highly original
AI Analysis

SkillClaw addresses repeated failures and inefficiencies in multi-user agent ecosystems by enabling cross-user knowledge transfer.

The paper tackles the problem of static skills in LLM agents by introducing SkillClaw, a framework for collective skill evolution that aggregates multi-user interactions to update skills automatically, resulting in significant performance improvements for Qwen3-Max on WildClawBench.

Large language model (LLM) agents such as OpenClaw rely on reusable skills to perform complex tasks, yet these skills remain largely static after deployment. As a result, similar workflows, tool usage patterns, and failure modes are repeatedly rediscovered across users, preventing the system from improving with experience. While interactions from different users provide complementary signals about when a skill works or fails, existing systems lack a mechanism to convert such heterogeneous experiences into reliable skill updates. To address these issues, we present SkillClaw, a framework for collective skill evolution in multi-user agent ecosystems, which treats cross-user and over-time interactions as the primary signal for improving skills. SkillClaw continuously aggregates trajectories generated during use and processes them with an autonomous evolver, which identifies recurring behavioral patterns and translates them into updates to the skill set by refining existing skills or extending them with new capabilities. The resulting skills are maintained in a shared repository and synchronized across users, allowing improvements discovered in one context to propagate system-wide while requiring no additional effort from users. By integrating multi-user experience into ongoing skill updates, SkillClaw enables cross-user knowledge transfer and cumulative capability improvement, and experiments on WildClawBench show that, with limited interaction and feedback, it significantly improves the performance of Qwen3-Max in real-world agent scenarios.
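The loop the abstract describes (aggregate trajectories across users, detect recurring patterns, write updates to a shared skill repository) can be illustrated with a toy sketch. This is not SkillClaw's implementation: the names `Trajectory`, `SkillRepository`, `evolve`, and `min_users` are hypothetical, and the "evolver" here is a simple frequency rule standing in for the paper's autonomous, agentic evolver.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Trajectory:
    """One recorded use of a skill by one user (hypothetical schema)."""
    user: str
    skill: str
    succeeded: bool
    failure_note: str = ""  # short label for what went wrong, if anything


@dataclass
class SkillRepository:
    """Shared store: skill name -> list of refinement hints seen by all users."""
    skills: dict = field(default_factory=dict)

    def refine(self, skill: str, hint: str) -> None:
        hints = self.skills.setdefault(skill, [])
        if hint not in hints:  # keep updates idempotent
            hints.append(hint)


def evolve(repo: SkillRepository, trajectories: list, min_users: int = 2) -> list:
    """Toy stand-in for an evolver: promote a failure note to a skill
    refinement once at least `min_users` distinct users have reported it."""
    # Deduplicate by user so one user's repeated failures count only once.
    seen = {
        (t.skill, t.failure_note, t.user)
        for t in trajectories
        if not t.succeeded and t.failure_note
    }
    counts = Counter((skill, note) for skill, note, _user in seen)

    updated = []
    for (skill, note), n_users in sorted(counts.items()):
        if n_users >= min_users:
            repo.refine(skill, f"avoid: {note}")
            updated.append(skill)
    return updated
```

In this sketch, a failure that two different users hit on the same skill becomes a refinement that every user of the shared repository sees afterward, while a failure reported by a single user is left alone. The distinct-user threshold is an assumption chosen to illustrate why cross-user signals are more reliable than one user's repeated errors; the paper's actual criteria are not stated in the abstract.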

