LG ML | Jul 9, 2020

Learning Graph Structure With A Finite-State Automaton Layer

Daniel D. Johnson, Hugo Larochelle, Daniel Tarlow

arXiv:2007.04929v28.517 citations

Originality: Highly original

AI Analysis

This work addresses the need for automated relational learning in graph-based neural networks, particularly in domains such as program analysis. The contribution is incremental in that it adds a new layer to existing graph models rather than proposing a new model family.

The paper tackles the problem of learning abstract relations from the intrinsic structure of a graph. It introduces a differentiable Graph Finite-State Automaton (GFSA) layer that learns finite-state automaton policies end-to-end. The layer finds shortcuts in grid-world graphs and reproduces simple static analyses on Python programs, and on the variable misuse task it outperforms hand-engineered semantic edges and other baseline methods for adding learned edge types.

Graph-based neural network models are producing strong results in a number of domains, in part because graphs provide flexibility to encode domain knowledge in the form of relational structure (edges) between nodes in the graph. In practice, edges are used both to represent intrinsic structure (e.g., abstract syntax trees of programs) and more abstract relations that aid reasoning for a downstream task (e.g., results of relevant program analyses). In this work, we study the problem of learning to derive abstract relations from the intrinsic graph structure. Motivated by their power in program analyses, we consider relations defined by paths on the base graph accepted by a finite-state automaton. We show how to learn these relations end-to-end by relaxing the problem into learning finite-state automata policies on a graph-based POMDP and then training these policies using implicit differentiation. The result is a differentiable Graph Finite-State Automaton (GFSA) layer that adds a new edge type (expressed as a weighted adjacency matrix) to a base graph. We demonstrate that this layer can find shortcuts in grid-world graphs and reproduce simple static analyses on Python programs. Additionally, we combine the GFSA layer with a larger graph-based model trained end-to-end on the variable misuse program understanding task, and find that using the GFSA layer leads to better performance than using hand-engineered semantic edges or other baseline methods for adding learned edge types.
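To make the phrase "relations defined by paths on the base graph accepted by a finite-state automaton" concrete, here is a minimal sketch of the discrete, non-differentiable version of that idea. It is not the GFSA layer from the paper: there is no POMDP relaxation, no learned policy, and no implicit differentiation. The function name `fsa_path_relation`, the edge label `"child"`, the state names, and the toy graph are all hypothetical and chosen only for illustration. The sketch searches the product of graph nodes and automaton states, and returns a 0/1 adjacency matrix for the derived edge type.

```python
from collections import deque


def fsa_path_relation(num_nodes, edges, transitions, start_state, accepting):
    """Return a 0/1 adjacency matrix A where A[u][v] == 1 if some path
    from u to v spells a label sequence accepted by the automaton.

    edges:       list of (src, label, dst) triples of the base graph
    transitions: dict mapping (state, label) -> next_state (deterministic)
    """
    # Index outgoing edges by source node.
    out = {n: [] for n in range(num_nodes)}
    for src, label, dst in edges:
        out[src].append((label, dst))

    adjacency = [[0] * num_nodes for _ in range(num_nodes)]
    for source in range(num_nodes):
        # Breadth-first search over (graph node, automaton state) pairs.
        seen = {(source, start_state)}
        queue = deque(seen)
        while queue:
            node, state = queue.popleft()
            if state in accepting:
                adjacency[source][node] = 1
            for label, dst in out[node]:
                nxt = transitions.get((state, label))
                if nxt is not None and (dst, nxt) not in seen:
                    seen.add((dst, nxt))
                    queue.append((dst, nxt))
    return adjacency


# Hypothetical toy tree: node 0 has children 1 and 4; node 1 has children 2 and 3.
edges = [(0, "child", 1), (1, "child", 2), (1, "child", 3), (0, "child", 4)]

# Hypothetical automaton accepting exactly the label sequence "child child",
# so the derived relation links each node to its grandchildren.
transitions = {("q0", "child"): "q1", ("q1", "child"): "q2"}

adj = fsa_path_relation(5, edges, transitions, "q0", {"q2"})
print(adj[0])  # row for node 0: its grandchildren are nodes 2 and 3
```

In the abstract's terms, the returned matrix plays the role of the new edge type. The learned layer differs in that the added edge type is a weighted adjacency matrix and the automaton is trained end-to-end instead of being written by hand.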
