LG | PL | SE | Aug 15, 2022

A Library for Representing Python Programs as Graphs for Machine Learning

DeepMind
arXiv:2208.07461v1 | 6 citations | h-index: 45 | Has Code
Originality: Synthesis-oriented
AI Analysis

The library gives researchers in machine learning for code a practical tool, but the contribution is incremental because it builds on existing graph representation methods.

The authors introduced an open-source Python library that constructs graph representations of Python programs for machine learning, and demonstrated its utility through a case study on millions of competitive programming submissions.

Graph representations of programs are commonly a central element of machine learning for code research. We introduce an open-source Python library, python_graphs, that applies static analysis to construct graph representations of Python programs suitable for training machine learning models. Our library admits the construction of control-flow graphs, data-flow graphs, and composite "program graphs" that combine control-flow, data-flow, syntactic, and lexical information about a program. We present the capabilities and limitations of the library, perform a case study applying the library to millions of competitive programming submissions, and showcase the library's utility for machine learning research.
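To make the idea of a program graph concrete, the following is a minimal sketch that uses only Python's standard `ast` module. It is not the python_graphs API and does not reproduce the library's analyses. The function name `build_syntax_graph` and the edge labels `child` and `next_occurrence` are hypothetical, chosen for illustration. The sketch covers only syntactic edges plus a crude link between successive occurrences of the same variable name, and it omits control flow entirely.

```python
import ast

# Only these AST node kinds become graph nodes (an assumption of this sketch);
# context and operator nodes such as Load or Add are skipped.
GRAPH_NODE_TYPES = (ast.mod, ast.stmt, ast.expr)


def build_syntax_graph(source):
    """Return (nodes, edges) describing a simple graph over the AST of `source`.

    nodes: list of AST node type names, indexed by node id.
    edges: list of (src_id, dst_id, edge_type) tuples.
    """
    tree = ast.parse(source)
    nodes = []
    index_of = {}  # id(AST node) -> position in `nodes`

    for node in ast.walk(tree):
        if isinstance(node, GRAPH_NODE_TYPES):
            index_of[id(node)] = len(nodes)
            nodes.append(type(node).__name__)

    edges = []
    # Syntactic edges: parent -> child in the AST.
    for node in ast.walk(tree):
        if id(node) not in index_of:
            continue
        for child in ast.iter_child_nodes(node):
            if id(child) in index_of:
                edges.append((index_of[id(node)], index_of[id(child)], "child"))

    # Crude variable edges: link each Name to the next Name with the same
    # identifier in source order. This is a rough proxy, not real data flow.
    names = sorted(
        (n for n in ast.walk(tree) if isinstance(n, ast.Name)),
        key=lambda n: (n.lineno, n.col_offset),
    )
    last_seen = {}
    for name in names:
        if name.id in last_seen:
            edges.append((last_seen[name.id], index_of[id(name)], "next_occurrence"))
        last_seen[name.id] = index_of[id(name)]

    return nodes, edges


SOURCE = "x = 1\ny = x + 2\nprint(y)\n"
nodes, edges = build_syntax_graph(SOURCE)
for src, dst, kind in edges:
    print(f"{nodes[src]} -[{kind}]-> {nodes[dst]}")
```

A node list with typed edges like this is a common input format for graph neural networks. A composite program graph as described in the abstract would add control-flow and proper data-flow edge types on top of the syntactic ones.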

Code Implementations: 1 repo
