IRAISE, Dec 17, 2024

XPath Agent: An Efficient XPath Programming Agent Based on LLM for Web Crawler

arXiv:2502.15688v1 | 8.53 citations | h-index: 1 | Has Code
Originality: Incremental advance
AI Analysis

This paper addresses the time-consuming manual development of XPath queries for web crawling and GUI testing workflows. The advance appears incremental because it builds on existing agent methods.

The paper tackles the problem of automating XPath query generation for web crawling and GUI testing by introducing XPath Agent, which achieves comparable performance to a state-of-the-art agent while significantly reducing token usage and improving clock-time efficiency.

We present XPath Agent, a production-ready XPath programming agent specifically designed for web crawling and web GUI testing. A key feature of XPath Agent is its ability to automatically generate XPath queries from a set of sampled web pages using a single natural language query. To demonstrate its effectiveness, we benchmark XPath Agent against a state-of-the-art XPath programming agent across a range of web crawling tasks. Our results show that XPath Agent achieves comparable performance metrics while significantly reducing token usage and improving clock-time efficiency. The well-designed two-stage pipeline allows for seamless integration into existing web crawling or web GUI testing workflows, thereby saving time and effort in manual XPath query development. The source code for XPath Agent is available at https://github.com/eavae/feilian.
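To make the idea of an XPath query that holds up across "a set of sampled web pages" concrete, the following is a minimal sketch of checking a candidate query against several pages. It is not the XPath Agent pipeline and does not use the feilian code. The sample pages, the two candidate queries, and the helper names `evaluate_xpath` and `generalizes` are all hypothetical, and the sketch relies on the limited XPath subset in Python's standard `xml.etree.ElementTree`, which requires well-formed markup.

```python
import xml.etree.ElementTree as ET

# Hypothetical sampled pages: well-formed XHTML snippets invented for illustration.
SAMPLED_PAGES = [
    "<html><body><div class='item'><span class='price'>$10</span></div></body></html>",
    "<html><body><div class='ad'/><div class='item'><span class='price'>$25</span></div></body></html>",
]

def evaluate_xpath(xpath, pages):
    """Return, for each page, the text of every element matched by `xpath`."""
    results = []
    for page in pages:
        root = ET.fromstring(page)  # root is the <html> element
        results.append([el.text for el in root.findall(xpath)])
    return results

def generalizes(xpath, pages):
    """A candidate query 'generalizes' here if it matches something on every page."""
    return all(len(matches) > 0 for matches in evaluate_xpath(xpath, pages))

# A position-based query breaks when an extra <div> appears before the target.
brittle = "./body/div[1]/span"
# An attribute-based query keeps working across both layouts.
robust = ".//span[@class='price']"

print(generalizes(brittle, SAMPLED_PAGES))
print(generalizes(robust, SAMPLED_PAGES))
print(evaluate_xpath(robust, SAMPLED_PAGES))
```

A check like this only tests that a query matches on every sample; whether the matched values are the ones the natural language query asked for would need a separate comparison against labeled data.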

Code Implementations: 1 repo
