AI, Feb 6, 2016

End-to-End Goal-Driven Web Navigation

arXiv:1602.02261v2, 41 citations

Originality: Incremental advance

AI Analysis

This work introduces a new benchmark for assessing AI agents on web navigation. The advance is incremental because it builds on existing datasets and methods for focused crawling and question answering.

The authors tackled the problem of evaluating agents' natural language understanding and planning in partially observed web environments by introducing a goal-driven web navigation benchmark, and demonstrated that neural net agents trained on their WikiNav dataset outperform traditional search engines on related tasks.

We propose goal-driven web navigation as a benchmark task for evaluating an agent's ability to understand natural language and plan in partially observed environments. In this challenging task, an agent navigates through a website, which is represented as a graph consisting of web pages as nodes and hyperlinks as directed edges, to find a web page in which a query appears. To succeed, the agent needs sophisticated high-level reasoning based on natural language and efficient sequential decision-making capability. We release a software tool, called WebNav, that automatically transforms a website into this goal-driven web navigation task, and as an example, we create WikiNav, a dataset constructed from the English Wikipedia. We extensively evaluate different variants of neural-network-based artificial agents on WikiNav and observe that the proposed goal-driven web navigation reflects advances in models well, making it a suitable benchmark for evaluating future progress. Furthermore, we extend WikiNav with question-answer pairs from Jeopardy! and test the proposed agent based on recurrent neural networks against strong inverted-index-based search engines. The artificial agents trained on WikiNav outperform the engine-based approaches, demonstrating the capability of the proposed goal-driven navigation as a good proxy for measuring progress in real-world tasks such as focused crawling and question answering.
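The task setup in the abstract (pages as nodes, hyperlinks as directed edges, a query that appears on some target page) can be illustrated with a toy sketch. This is not the WebNav tool or the paper's neural agent: the `Page` class, the `navigate` function, the word-overlap scoring rule, and the six-page site are all hypothetical, and the sketch assumes the agent may read the text of pages linked from its current page.

```python
from __future__ import annotations
from dataclasses import dataclass, field


@dataclass
class Page:
    """One node of the site graph: page text plus outgoing hyperlinks."""
    text: str
    links: list = field(default_factory=list)  # names of linked pages (directed edges)


def overlap(query: str, text: str) -> int:
    """Count distinct query words that also occur in the text (a crude stand-in for a learned score)."""
    return len(set(query.lower().split()) & set(text.lower().split()))


def navigate(site: dict, start: str, query: str, max_hops: int):
    """Greedy walk: follow the unvisited link whose page best matches the query.

    Returns (path, found), where found is True if the last page contains the query.
    """
    current = start
    path = [current]
    for _ in range(max_hops):
        if query.lower() in site[current].text.lower():
            return path, True
        # Assumption: the agent can inspect the text of directly linked pages only.
        candidates = [name for name in site[current].links if name not in path]
        if not candidates:
            break
        current = max(candidates, key=lambda name: overlap(query, site[name].text))
        path.append(current)
    return path, query.lower() in site[current].text.lower()


# Hypothetical six-page site used only for illustration.
site = {
    "Home": Page("welcome to the toy encyclopedia", ["Science", "Art"]),
    "Science": Page("science covers physics and biology", ["Physics", "Biology"]),
    "Art": Page("art covers painting and music", ["Painting"]),
    "Physics": Page("physics studies matter energy and quantum mechanics", []),
    "Biology": Page("biology studies living cells and evolution", []),
    "Painting": Page("painting uses pigment on canvas", []),
}

path, found = navigate(site, "Home", "physics studies matter", max_hops=3)
print(path, found)
```

A greedy word-overlap rule like this has no memory and no way to backtrack, which is one way to see why the abstract stresses language understanding and sequential decision-making: a useful agent has to choose links whose surface text may share little with the query.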
