SE · CL · Aug 15, 2019

Semantic Source Code Search: A Study of the Past and a Glimpse at the Future

arXiv:1908.06738v2 · 5 citations
AI Analysis

This is an incremental study that addresses the problem of inefficient code search for developers dealing with large and complex codebases.

The paper reviews existing methods for building source code search engines and identifies their limitations in handling high-level natural language queries, while outlining open research directions and obstacles toward a universal solution.

With the recent explosion in the size and complexity of source codebases and software projects, the need for efficient source code search engines has increased dramatically. Unfortunately, existing information retrieval-based methods fail to capture the query semantics and perform well only when the query contains syntax-based keywords. Consequently, such methods will perform poorly when given high-level natural language queries. In this paper, we review existing methods for building code search engines. We also outline the open research directions and the various obstacles that stand in the way of having a universal source code search engine.
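
The limitation described in the abstract can be illustrated with a minimal sketch. It is not a method from the paper: the scoring function, the sample snippet, and both queries are hypothetical, chosen only to show how a purely keyword-based matcher behaves when the query shares no literal tokens with the code.

```python
import re


def tokenize(text):
    """Split camelCase, lowercase, and return the set of alphanumeric tokens."""
    text = re.sub(r"([a-z0-9])([A-Z])", r"\1 \2", text)
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def keyword_score(query, snippet):
    """Fraction of query tokens that appear literally in the snippet.

    Hypothetical stand-in for a keyword-overlap retrieval score; real
    IR engines use weighted schemes, but the failure mode is the same.
    """
    query_tokens = tokenize(query)
    if not query_tokens:
        return 0.0
    return len(query_tokens & tokenize(snippet)) / len(query_tokens)


# Hypothetical code snippet to be searched.
SNIPPET = "def read_file(path):\n    with open(path) as f:\n        return f.read()"

# A query that reuses identifiers from the code.
keyword_query = "read file path"
# A high-level natural language query with the same intent.
natural_query = "load the contents of a document from disk"

print(keyword_score(keyword_query, SNIPPET))
print(keyword_score(natural_query, SNIPPET))
```

Both queries ask for the same function, yet only the first shares tokens with the snippet, so a keyword-overlap score cannot rank the snippet for the second. Closing that gap between query wording and code vocabulary is what the abstract means by capturing query semantics.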
