LG · May 21, 2022

NS3: Neuro-Symbolic Semantic Code Search

Cambridge · Microsoft
arXiv:2205.10674v2 · 14 citations · h-index: 42
Originality: Incremental advance
AI Analysis

This work aims to improve code retrieval accuracy for developers and researchers, particularly on complex queries. It appears incremental: it builds on existing neural methods and adds a modular design.

The paper tackles semantic code search, where current language models struggle with compositional text and multi-step reasoning. It proposes a neuro-symbolic approach that supplements each query with a semantic layout, which breaks the final reasoning decision into a series of lower-level decisions. The authors report more precise code retrieval on two datasets, CodeSearchNet and Code Search and Question Answering.

Semantic code search is the task of retrieving a code snippet given a textual description of its functionality. Recent work has been focused on using similarity metrics between neural embeddings of text and code. However, current language models are known to struggle with longer, compositional text, and multi-step reasoning. To overcome this limitation, we propose supplementing the query sentence with a layout of its semantic structure. The semantic layout is used to break down the final reasoning decision into a series of lower-level decisions. We use a Neural Module Network architecture to implement this idea. We compare our model - NS3 (Neuro-Symbolic Semantic Search) - to a number of baselines, including state-of-the-art semantic code retrieval methods, and evaluate on two datasets - CodeSearchNet and Code Search and Question Answering. We demonstrate that our approach results in more precise code retrieval, and we study the effectiveness of our modular design when handling compositional queries.
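To make the idea of breaking one retrieval decision into lower-level decisions concrete, here is a minimal toy sketch. It is not the authors' implementation: the `Entity` and `Action` classes, the hand-written layout, and the substring-matching "modules" are all hypothetical stand-ins for the learned neural modules and the parsed semantic layout described in the abstract. Combining the sub-scores by multiplication is also an assumption made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Entity:
    noun: str  # hypothetical: a noun phrase taken from the query


@dataclass
class Action:
    verb: str            # hypothetical: a verb taken from the query
    args: List[Entity]   # entities the verb applies to


def entity_score(entity: Entity, code_tokens: List[str]) -> float:
    """Toy entity module: 1.0 if any code token mentions the noun, else 0.0."""
    return 1.0 if any(entity.noun in tok.lower() for tok in code_tokens) else 0.0


def action_score(action: Action, code_tokens: List[str]) -> float:
    """Toy action module: the verb must appear and every argument must be found."""
    score = 1.0 if any(action.verb in tok.lower() for tok in code_tokens) else 0.0
    for arg in action.args:
        score *= entity_score(arg, code_tokens)
    return score


def layout_score(layout: List[Action], code_tokens: List[str]) -> float:
    """Combine the lower-level decisions into one score (product, by assumption)."""
    score = 1.0
    for action in layout:
        score *= action_score(action, code_tokens)
    return score


# Hand-written layout for the query "load a file and parse json".
layout = [Action("load", [Entity("file")]), Action("parse", [Entity("json")])]

snippet_a = ["def", "load_file", "(", "path", ")", ":",
             "return", "parse_json", "(", "open", "(", "path", ")", ")"]
snippet_b = ["def", "load_file", "(", "path", ")", ":",
             "return", "open", "(", "path", ")", ".read", "(", ")"]

scores = {
    "a": layout_score(layout, snippet_a),
    "b": layout_score(layout, snippet_b),
}
print(scores)
```

In this sketch, the snippet that covers only the first half of the query is rejected because one of its sub-decisions fails. A single similarity score over the whole query cannot expose that per-step behavior as directly.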

Code Implementations: 1 repo
