CL · May 13, 2017

Learning Semantic Correspondences in Technical Documentation

arXiv:1705.04815v1 · 19 citations
Originality: Synthesis-oriented
AI Analysis

The paper addresses the challenge of automating the understanding of technical documentation for developers and AI systems, but its contribution is incremental, since it builds on existing semantic parsing methods.

The paper tackles the problem of translating textual descriptions to formal representations in technical documentation by learning semantic correspondences, reporting new baseline results on sixteen novel datasets including documentation for nine programming languages and Unix utility manuals.

We consider the problem of translating high-level textual descriptions to formal representations in technical documentation as part of an effort to model the meaning of such documentation. We focus specifically on the problem of learning translational correspondences between text descriptions and grounded representations in the target documentation, such as formal representation of functions or code templates. Our approach exploits the parallel nature of such documentation, or the tight coupling between high-level text and the low-level representations we aim to learn. Data is collected by mining technical documents for such parallel text-representation pairs, which we use to train a simple semantic parsing model. We report new baseline results on sixteen novel datasets, including the standard library documentation for nine popular programming languages across seven natural languages, and a small collection of Unix utility manuals.
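The data-collection idea in the abstract, mining documentation for parallel text-representation pairs, can be illustrated with a minimal sketch. This is not the authors' pipeline: it assumes, purely for illustration, that the "text" side is the first paragraph of a Python docstring and the "representation" side is the function's signature. The helper names `pair_from_function` and `mine_pairs` are hypothetical.

```python
import inspect
import textwrap


def pair_from_function(func):
    """Return a (description, representation) pair for one function.

    Assumption: the first docstring paragraph serves as the high-level text
    and the rendered signature serves as the formal representation.
    Returns None when either side is unavailable.
    """
    doc = inspect.getdoc(func)  # docstring with indentation cleaned up
    if not doc:
        return None
    try:
        signature = str(inspect.signature(func))
    except (TypeError, ValueError):
        return None  # some callables expose no introspectable signature
    # Keep only the first paragraph and collapse it onto a single line.
    description = " ".join(doc.split("\n\n", 1)[0].split())
    return description, f"{func.__name__}{signature}"


def mine_pairs(module):
    """Collect pairs for the public functions defined in `module` itself."""
    pairs = []
    for name, obj in sorted(vars(module).items()):
        if name.startswith("_") or not inspect.isfunction(obj):
            continue
        if obj.__module__ != module.__name__:
            continue  # skip functions merely imported into the module
        pair = pair_from_function(obj)
        if pair is not None:
            description, representation = pair
            pairs.append((description, f"{module.__name__}.{representation}"))
    return pairs


if __name__ == "__main__":
    # Print the pairs mined from one standard-library module.
    for description, representation in mine_pairs(textwrap):
        print(f"{representation}\n    {description}")
```

Pairs of this shape could then serve as training data for a text-to-representation model. The sketch covers only one documentation source in one language, whereas the paper reports datasets spanning nine programming languages, seven natural languages, and Unix utility manuals.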

