Zachary Eberhart

SE
7papers
83citations
Novelty34%
AI Score20

7 Papers

SEJan 24, 2022
Generating Clarifying Questions for Query Refinement in Source Code Search

Zachary Eberhart, Collin McMillan

In source code search, a common information-seeking strategy involves providing a short initial query with a broad meaning, and then iteratively refining the query using terms gleaned from the results of subsequent searches. This strategy requires programmers to spend time reading search results that are irrelevant to their development needs. In contrast, when programmers seek information from other humans, they typically refine queries by asking and answering clarifying questions. Clarifying questions have been shown to benefit general-purpose search engines, but have not been examined in the context of code search. We present a method for generating natural-sounding clarifying questions using information extracted from function names and comments. Our method outperformed a keyword-based method for single-turn refinement in synthetic studies, and was associated with shorter search duration in human studies.

SEJul 26, 2021
Dialogue Management for Interactive API Search

Zachary Eberhart, Collin McMillan

API search involves finding components in an API that are relevant to a programming task. For example, a programmer may need a function in a C library that opens a new network connection, then another function that sends data across that connection. Unfortunately, programmers often have trouble finding the API components that they need. A strong scientific consensus is emerging towards developing interactive tool support that responds to conversational feedback, emulating the experience of asking a fellow human programmer for help. A major barrier to creating these interactive tools is implementing dialogue management for API search. Dialogue management involves determining how a system should respond to user input, such as whether to ask a clarification question or to display potential results. In this paper, we present a dialogue manager for interactive API search that considers search results and dialogue history to select efficient actions. We implement two dialogue policies: a hand-crafted policy and a policy optimized via reinforcement learning. We perform a synthetics evaluation and a human evaluation comparing the policies to a generic single-turn, top-N policy used by source code search engines.

SEJun 23, 2021
A Wizard of Oz Study Simulating API Usage Dialogues with a Virtual Assistant

Zachary Eberhart, Aakash Bansal, Collin McMillan

Virtual Assistant technology is rapidly proliferating to improve productivity in a variety of tasks. While several virtual assistants for everyday tasks are well-known (e.g., Siri, Cortana, Alexa), assistants for specialty tasks such as software engineering are rarer. One key reason software engineering assistants are rare is that very few experimental datasets are available and suitable for training the AI that is the bedrock of current virtual assistants. In this paper, we present a set of Wizard of Oz experiments that we designed to build a dataset for creating a virtual assistant. Our target is a hypothetical virtual assistant for helping programmers use APIs. In our experiments, we recruited 30 professional programmers to complete programming tasks using two APIs. The programmers interacted with a simulated virtual assistant for help - the programmers were not aware that the assistant was actually operated by human experts. We then annotated the dialogue acts in the corpus along four dimensions: illocutionary intent, API information type(s), backward-facing function, and traceability to specific API components. We observed a diverse range of interactions that will facilitate the development of dialogue strategies for virtual assistants for API usage.

SEJan 11, 2021
A Neural Question Answering System for Basic Questions about Subroutines

Aakash Bansal, Zachary Eberhart, Lingfei Wu et al.

A question answering (QA) system is a type of conversational AI that generates natural language answers to questions posed by human users. QA systems often form the backbone of interactive dialogue systems, and have been studied extensively for a wide variety of tasks ranging from restaurant recommendations to medical diagnostics. Dramatic progress has been made in recent years, especially from the use of encoder-decoder neural architectures trained with big data input. In this paper, we take initial steps to bringing state-of-the-art neural QA technologies to Software Engineering applications by designing a context-based QA system for basic questions about subroutines. We curate a training dataset of 10.9 million question/context/answer tuples based on rules we extract from recent empirical studies. Then, we train a custom neural QA model with this dataset and evaluate the model in a study with professional programmers. We demonstrate the strengths and weaknesses of the system, and lay the groundwork for its use in eventual dialogue systems for software engineering.

SEJan 27, 2020
The Apiza Corpus: API Usage Dialogues with a Simulated Virtual Assistant

Zachary Eberhart, Aakash Bansal, Collin McMillan

Virtual assistant technology has the potential to make a significant impact in the field of software engineering. However, few SE-related datasets exist that would be suitable for the design or training of a virtual assistant. To help lay the groundwork for a hypothetical virtual assistant for API usage, we designed and conducted a Wizard-of-Oz study to gather this crucial data. We hired 30 professional programmers to complete a series of programming tasks by interacting with a simulated virtual assistant. Unbeknownst to the programmers, the virtual assistant was actually operated by another human expert. In this report, we describe our experimental methodology and summarize the results of the study.

SEDec 21, 2019
Automatically Extracting Subroutine Summary Descriptions from Unstructured Comments

Zachary Eberhart, Alexander LeClair, Collin McMillan

Summary descriptions of subroutines are short (usually one-sentence) natural language explanations of a subroutine's behavior and purpose in a program. These summaries are ubiquitous in documentation, and many tools such as JavaDocs and Doxygen generate documentation built around them. And yet, extracting summaries from unstructured source code repositories remains a difficult research problem -- it is very difficult to generate clean structured documentation unless the summaries are annotated by programmers. This becomes a problem in large repositories of legacy code, since it is cost prohibitive to retroactively annotate summaries in dozens or hundreds of old programs. Likewise, it is a problem for creators of automatic documentation generation algorithms, since these algorithms usually must learn from large annotated datasets, which do not exist for many programming languages. In this paper, we present a semi-automated approach via crowdsourcing and a fully-automated approach for annotating summaries from unstructured code comments. We present experiments validating the approaches, and provide recommendations and cost estimates for automatically annotating large repositories.

SEJun 5, 2018
Adapting Neural Text Classification for Improved Software Categorization

Alexander LeClair, Zachary Eberhart, Collin McMillan

Software Categorization is the task of organizing software into groups that broadly describe the behavior of the software, such as "editors" or "science." Categorization plays an important role in several maintenance tasks, such as repository navigation and feature elicitation. Current approaches attempt to cast the problem as text classification, to make use of the rich body of literature from the NLP domain. However, as we will show in this paper, text classification algorithms are generally not applicable off-the-shelf to source code; we found that they work well when high-level project descriptions are available, but suffer very large performance penalties when classifying source code and comments only. We propose a set of adaptations to a state-of-the-art neural classification algorithm and perform two evaluations: one with reference data from Debian end-user programs, and one with a set of C/C++ libraries that we hired professional programmers to annotate. We show that our proposed approach achieves performance exceeding that of previous software classification techniques as well as a state-of-the-art neural text classification technique.