Diego Clerissi, Alessandro Vasina, Leonardo Mariani
Task-based chatbots are nowadays widely adopted software systems, usually integrated into real-world applications and communication channels, designed to assist users in completing tasks through conversational interfaces. Like any other software, even chatbots are prone to bugs. Despite their increasing pervasiveness in everyday activities, existing techniques for assessing their quality still exhibit several limitations, such as the simplicity of generated test scenarios and oracle weaknesses. In this paper, we present LexTester, an automated model-based testing technique for Amazon Lex chatbots. The technique explores the conversational space of the chatbot under test to generate a Dialog Graph of all possible interactions, from which an executable test suite is generated according to different coverage strategies. LexTester was evaluated against the state-of-the-practice chatbot testing tool Botium on five Amazon Lex chatbots, consistently outperforming it in all subjects, generating more tests with nearly double complexity, achieving overall 83-95% coverage of conversational elements, and improving fault detection effectiveness by up to four times at comparable time costs.