SE · Nov 26, 2015

SWIM: Synthesizing What I Mean

arXiv:1511.08497v2 · 171 citations · Has Code
Originality: Incremental advance
AI Analysis

This paper addresses the challenge programmers face in learning and using large API libraries efficiently. Its contribution is incremental, since it builds on existing search and synthesis techniques.

The paper tackles the problem of helping programmers find relevant code snippets for API-related queries. It introduces SWIM, a tool that maps natural language queries to the relevant APIs using clickthrough data and then synthesizes idiomatic code from patterns learned in open-source repositories. On the benchmarked queries, the first suggested snippet was a relevant solution 70% of the time, and a relevant solution appeared in the top 10 results for every query.

Modern programming frameworks come with large libraries, with applications as diverse as matching regular expressions, parsing XML files, and sending email. Programmers often use search engines such as Google and Bing to learn about existing APIs. In this paper, we describe SWIM, a tool which suggests code snippets given API-related natural language queries such as "generate md5 hash code". We translate user queries into the APIs of interest using clickthrough data from the Bing search engine. Then, based on patterns learned from open-source code repositories, we synthesize idiomatic code describing the use of these APIs. We introduce structured call sequences to capture API-usage patterns. Structured call sequences are a generalized form of method call sequences, with if-branches and while-loops to represent conditional and repeated API usage patterns, and are simple to extract and amenable to synthesis. We evaluated SWIM with 30 common C# API-related queries received by Bing. For 70% of the queries, the first suggested snippet was a relevant solution, and a relevant solution was present in the top 10 results for all benchmarked queries. The online portion of the workflow is also very responsive, at an average of 1.5 seconds per snippet.
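To make the idea of a structured call sequence concrete, here is a minimal Python sketch of one possible representation. It is not SWIM's actual data structure or algorithm: the class names `Call`, `If`, and `While` and the helpers `render` and `flatten` are hypothetical, chosen only for illustration. The example sequence uses `System.Xml.XmlReader` calls from C# as plain strings, picked to match the abstract's "parsing XML files" scenario rather than taken from the paper.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Call:
    """A single API method call, stored as C# source text (illustrative)."""
    text: str


@dataclass
class If:
    """Conditional API usage: the body is guarded by a condition call."""
    cond: Call
    body: List["Node"] = field(default_factory=list)


@dataclass
class While:
    """Repeated API usage: the body repeats while the condition call holds."""
    cond: Call
    body: List["Node"] = field(default_factory=list)


Node = Union[Call, If, While]


def render(nodes: List[Node], indent: int = 0) -> List[str]:
    """Pretty-print a structured call sequence as C#-like source lines."""
    pad = "    " * indent
    lines: List[str] = []
    for node in nodes:
        if isinstance(node, Call):
            lines.append(f"{pad}{node.text};")
        else:
            keyword = "if" if isinstance(node, If) else "while"
            lines.append(f"{pad}{keyword} ({node.cond.text})")
            lines.append(f"{pad}{{")
            lines.extend(render(node.body, indent + 1))
            lines.append(f"{pad}}}")
    return lines


def flatten(nodes: List[Node]) -> List[str]:
    """Drop the control structure, leaving a plain method call sequence."""
    calls: List[str] = []
    for node in nodes:
        if isinstance(node, Call):
            calls.append(node.text)
        else:
            calls.append(node.cond.text)
            calls.extend(flatten(node.body))
    return calls


# Hypothetical usage pattern: read an XML file and inspect start elements.
example: List[Node] = [
    Call("XmlReader reader = XmlReader.Create(path)"),
    While(Call("reader.Read()"), [
        If(Call("reader.IsStartElement()"), [
            Call("reader.GetAttribute(name)"),
        ]),
    ]),
    Call("reader.Close()"),
]

if __name__ == "__main__":
    print("\n".join(render(example)))
```

Here `flatten` recovers the ordinary method call sequence, which is the sense in which the structured form generalizes it: the same calls in the same order, plus a record of which ones are conditional or repeated.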

