Procedural Knowledge Improves Agentic LLM Workflows
This work addresses the challenge of improving LLM performance on agentic tasks for AI developers, though the contribution is incremental because it builds on existing procedural knowledge methods.
The paper tackles the problem of LLMs struggling with agentic tasks by formalizing and evaluating a workflow that uses hierarchical task networks (HTNs) as procedural knowledge, showing that hand-coded HTNs boost smaller LLMs (20b or 70b parameters) to outperform a larger 120b parameter baseline.
Large language models (LLMs) often struggle when performing agentic tasks without substantial tool support, prompt engineering, or fine-tuning. Despite research showing that domain-dependent, procedural knowledge can dramatically increase planning efficiency, little work evaluates its potential for improving LLM performance on agentic tasks that may require implicit planning. We formalize, implement, and evaluate an agentic LLM workflow that leverages procedural knowledge in the form of a hierarchical task network (HTN). Empirical results of our implementation show that hand-coded HTNs can dramatically improve LLM performance on agentic tasks, and using HTNs can boost a 20b or 70b parameter LLM to outperform a much larger 120b parameter LLM baseline. Furthermore, LLM-created HTNs improve overall performance, though less so. The results suggest that leveraging expertise--from humans, documents, or LLMs--to curate procedural knowledge will become another important tool for improving LLM workflows.
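To make the idea of an HTN as procedural knowledge concrete, the following is a minimal sketch, not the paper's implementation. It assumes a simplified HTN with exactly one method per compound task and no preconditions; the task names, the `HTN` class, and the `make_op` helper are all hypothetical. In an agentic LLM workflow, each primitive step would presumably be delegated to an LLM call; here a plain function stands in for that call.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

State = Dict[str, bool]


@dataclass
class HTN:
    """Simplified HTN: one method per compound task, no preconditions (assumption)."""

    # Maps a compound task name to its ordered list of subtasks.
    methods: Dict[str, List[str]] = field(default_factory=dict)
    # Maps a primitive task name to an action that returns an updated state.
    operators: Dict[str, Callable[[State], State]] = field(default_factory=dict)

    def decompose(self, task: str) -> List[str]:
        """Expand a task depth-first into an ordered list of primitive tasks."""
        if task in self.operators:
            return [task]
        if task not in self.methods:
            raise KeyError(f"no method or operator for task: {task}")
        plan: List[str] = []
        for subtask in self.methods[task]:
            plan.extend(self.decompose(subtask))
        return plan

    def run(self, task: str, state: State) -> State:
        """Execute the primitive steps of the decomposed plan in order."""
        for step in self.decompose(task):
            state = self.operators[step](state)
        return state


def make_op(flag: str) -> Callable[[State], State]:
    """Hypothetical stand-in for an LLM-executed step: marks a flag as done."""
    def op(state: State) -> State:
        new_state = dict(state)
        new_state[flag] = True
        return new_state
    return op


# Hypothetical task hierarchy, for illustration only.
htn = HTN(
    methods={
        "file_bug_report": ["gather_info", "submit"],
        "gather_info": ["read_logs", "summarize_logs"],
    },
    operators={
        "read_logs": make_op("logs_read"),
        "summarize_logs": make_op("summary_written"),
        "submit": make_op("submitted"),
    },
)

plan = htn.decompose("file_bug_report")
final_state = htn.run("file_bug_report", {})
```

The point of the sketch is that the ordering of steps comes from the curated hierarchy rather than from the model, so the model only has to carry out one small, well-scoped step at a time.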