The Apiza Corpus: API Usage Dialogues with a Simulated Virtual Assistant
This work addresses a data gap for researchers and developers building virtual assistants for software engineering. Its contribution is incremental: it focuses on data collection rather than new methods.
To address the lack of datasets for virtual assistants in software engineering, the authors conducted a Wizard-of-Oz study with 30 professional programmers to gather API usage dialogues, producing the Apiza Corpus.
Virtual assistant technology has the potential to make a significant impact in the field of software engineering (SE). However, few SE-related datasets exist that are suitable for designing or training a virtual assistant. To help lay the groundwork for a hypothetical virtual assistant for API usage, we designed and conducted a Wizard-of-Oz study to gather dialogue data on API usage. We hired 30 professional programmers to complete a series of programming tasks by interacting with a simulated virtual assistant. Unbeknownst to the programmers, the virtual assistant was actually operated by a human expert. In this report, we describe our experimental methodology and summarize the results of the study.