NaviQAte: Functionality-Guided Web Application Navigation
This work addresses automated web application testing for developers. It offers a novel method that improves adaptability in dynamic web environments, though the contribution is incremental because it builds on existing datasets and models.
The paper tackles the challenge of exploring diverse web application functionalities in end-to-end web testing by introducing NaviQAte, which frames exploration as a question-and-answer task. NaviQAte achieves a 44.23% success rate in user task navigation and a 38.46% success rate in functionality navigation, improvements of 15% and 33%, respectively, over the WebCanvas baseline.
End-to-end web testing is challenging due to the need to explore diverse web application functionalities. Current state-of-the-art methods, such as WebCanvas, are not designed for broad functionality exploration; they rely on specific, detailed task descriptions, which limits their adaptability in dynamic web environments. We introduce NaviQAte, which frames web application exploration as a question-and-answer task, generating action sequences for functionalities without requiring detailed parameters. Our three-phase approach uses advanced large language models such as GPT-4o for complex decision-making and cost-effective models such as GPT-4o mini for simpler tasks. NaviQAte focuses on functionality-guided web application navigation, integrating multi-modal inputs such as text and images to enhance contextual understanding. Evaluations on the Mind2Web-Live and Mind2Web-Live-Abstracted datasets show that NaviQAte achieves a 44.23% success rate in user task navigation and a 38.46% success rate in functionality navigation, improvements of 15% and 33%, respectively, over WebCanvas. These results underscore the effectiveness of our approach in advancing automated web application testing.
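
To make the question-and-answer framing and the two-tier model choice concrete, the following is a minimal sketch and not the authors' implementation. The `Element` structure, the prompt wording, the `is_complex` flag, and the `ask` callable are all hypothetical; only the idea of posing navigation as a question and sending harder decisions to GPT-4o and simpler ones to GPT-4o mini comes from the abstract. Images are omitted for brevity.

```python
from dataclasses import dataclass
from typing import Callable, List

# Model tiers named in the abstract; the routing rule below is an assumption.
STRONG_MODEL = "gpt-4o"
CHEAP_MODEL = "gpt-4o-mini"


@dataclass
class Element:
    """An actionable page element (hypothetical structure)."""
    index: int
    tag: str
    text: str


def build_question(functionality: str, elements: List[Element]) -> str:
    """Frame the next navigation step as a multiple-choice question."""
    options = "\n".join(f"{e.index}. <{e.tag}> {e.text}" for e in elements)
    return (
        f"Goal functionality: {functionality}\n"
        f"Which element should be acted on next?\n{options}\n"
        "Answer with the option number only."
    )


def pick_model(is_complex: bool) -> str:
    """Route complex decisions to the stronger model (assumed rule)."""
    return STRONG_MODEL if is_complex else CHEAP_MODEL


def choose_action(
    functionality: str,
    elements: List[Element],
    ask: Callable[[str, str], str],
    is_complex: bool = True,
) -> Element:
    """Ask the selected model which element to act on and return it.

    `ask(model_name, question)` is a caller-supplied stand-in for a real
    LLM client; it must return the chosen option number as text.
    """
    question = build_question(functionality, elements)
    answer = ask(pick_model(is_complex), question)
    by_index = {e.index: e for e in elements}
    # Raises ValueError or KeyError if the answer is not a listed option.
    return by_index[int(answer.strip())]


def fake_ask(model: str, question: str) -> str:
    """Canned answer used in place of a real model call."""
    return "2"
```

With the canned `fake_ask`, `choose_action("add an item to the cart", [Element(1, "a", "Home"), Element(2, "button", "Add to cart")], fake_ask)` returns the "Add to cart" element. Note that the functionality description carries no detailed parameters, which mirrors the abstracted setting described above.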