SE · May 27, 2019

DockerizeMe: Automatic Inference of Environment Dependencies for Python Code Snippets

arXiv:1905.11127v1 · 63 citations
Originality: Incremental advance
AI Analysis

This paper addresses a practical problem for developers who use platforms like Stack Overflow and GitHub: it automates dependency inference for Python code snippets.

The paper tackles the problem that most Python code snippets shared online cannot be directly executed due to missing dependencies, and presents DockerizeMe, which resolves import errors in 892 out of nearly 3,000 gists where a baseline approach failed.

Platforms like Stack Overflow and GitHub's gist system promote the sharing of ideas and programming techniques via the distribution of code snippets designed to illustrate particular tasks. Python, a popular and fast-growing programming language, sees heavy use on both sites, with nearly one million questions asked on Stack Overflow and 400 thousand public gists on GitHub. Unfortunately, around 75% of the Python example code shared through these sites cannot be directly executed. When run in a clean environment, over 50% of public Python gists fail due to an import error for a missing library. We present DockerizeMe, a technique for inferring the dependencies needed to execute a Python code snippet without import error. DockerizeMe starts with offline knowledge acquisition of the resources and dependencies for popular Python packages from the Python Package Index (PyPI). It then builds Docker specifications using a graph-based inference procedure. Our inference procedure resolves import errors in 892 out of nearly 3,000 gists from the Gistable dataset for which Gistable's baseline approach could not find and install all dependencies.
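The pipeline described in the abstract (collect package knowledge, infer dependencies over a graph, emit a Docker specification) can be illustrated with a small sketch. This is not the authors' implementation: the `KNOWLEDGE` table, the function names, and the `python:3.7` base image are all assumptions made for illustration, and the tiny hand-written table stands in for the knowledge DockerizeMe acquires offline from PyPI. The sketch parses a snippet with Python's `ast` module, so it only handles code that parses under the running interpreter.

```python
import ast

# Hypothetical, hand-written knowledge base (an assumption, not DockerizeMe's data):
# import name -> (PyPI package name, import names of its dependencies).
KNOWLEDGE = {
    "yaml": ("PyYAML", []),
    "requests": ("requests", ["urllib3"]),
    "urllib3": ("urllib3", []),
}


def top_level_imports(source):
    """Return the sorted top-level module names imported by a snippet."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # level == 0 excludes relative imports such as "from . import x".
            names.add(node.module.split(".")[0])
    return sorted(names)


def resolve_packages(imports):
    """Walk the dependency graph depth-first, listing dependencies before dependents."""
    ordered, seen = [], set()

    def visit(name):
        # Names missing from the table (e.g. standard library modules) are skipped.
        if name in seen or name not in KNOWLEDGE:
            return
        seen.add(name)
        package, deps = KNOWLEDGE[name]
        for dep in deps:
            visit(dep)
        ordered.append(package)

    for name in imports:
        visit(name)
    return ordered


def dockerfile_for(source, base="python:3.7"):
    """Build Dockerfile text that installs the inferred packages."""
    packages = resolve_packages(top_level_imports(source))
    lines = [f"FROM {base}"]
    lines += [f"RUN pip install {package}" for package in packages]
    return "\n".join(lines)


if __name__ == "__main__":
    snippet = "import os\nimport requests\nfrom yaml import safe_load\n"
    print(dockerfile_for(snippet))
```

Run on the example snippet, the sketch skips `os` because it is not in the table, then lists `urllib3` ahead of `requests` because the traversal visits dependencies first. A real system would also need to handle import names that differ from package names at scale, version conflicts, and system-level dependencies, which this sketch ignores.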

