PL · AI · Jun 22, 2020

Information-theoretic User Interaction: Significant Inputs for Program Synthesis

arXiv:2006.12638v1
Originality: Highly original
AI Analysis

This paper addresses the efficiency of user interaction in programming-by-example systems, which are deployed in industrial products for data transformations.

The paper tackles the problem of identifying the most informative questions to ask users in interactive program synthesis systems, showing this 'significant questions problem' is hard in general and developing an information-theoretic greedy algorithm to solve it. The resulting active program learner converges in a small number of iterations on a real-world dataset of around 800 string transformation tasks.

Programming-by-example technologies are being deployed in industrial products for real-time synthesis of various kinds of data transformations. These technologies rely on the user to provide a few representative examples of the transformation task. Motivated by the need to find the most pertinent question to ask the user, in this paper we introduce the 'significant questions problem' and show that it is hard in general. We then develop an information-theoretic greedy approach for solving the problem. We justify the greedy algorithm using the conditional entropy result, which informally says that the question that achieves the maximum information gain is the one that we know least about. In the context of interactive program synthesis, we use the above result to develop an active program learner that generates the significant inputs to pose as queries to the user in each iteration. The procedure requires extending a passive program learner to a sampling program learner that is able to sample candidate programs from the set of all consistent programs to enable estimation of information gain. It also uses clustering of inputs, based on features in the inputs and the corresponding outputs, to sample a small set of candidate significant inputs. Our active learner is able to trade off false negatives for false positives and converge in a small number of iterations on a real-world dataset of around 800 string transformation tasks.
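The greedy selection step described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation. It assumes that a handful of candidate programs has already been sampled from the consistent set, that the samples are weighted uniformly, and that the candidate inputs are given directly rather than produced by clustering. Under those assumptions, the information gained by asking about an input equals the entropy of the outputs the sampled programs produce on it, so the input the programs disagree on most is picked. The names `output_entropy`, `most_significant_input`, `sampled`, and `candidates` are hypothetical.

```python
import math
from collections import Counter
from typing import Callable, Sequence

Program = Callable[[str], str]


def output_entropy(programs: Sequence[Program], x: str) -> float:
    """Shannon entropy (in bits) of the outputs the sampled programs give on x.

    Assumes each sampled program is equally likely (uniform weighting).
    """
    counts = Counter(p(x) for p in programs)
    n = len(programs)
    return sum((c / n) * math.log2(n / c) for c in counts.values())


def most_significant_input(programs: Sequence[Program],
                           candidates: Sequence[str]) -> str:
    """Greedy choice: the candidate input whose output is least certain."""
    return max(candidates, key=lambda x: output_entropy(programs, x))


# Hypothetical sampled programs, all consistent with "john smith" -> "John".
sampled = [
    lambda s: s.split()[0].capitalize(),      # first word, capitalized
    lambda s: s[:4].capitalize(),             # first four characters
    lambda s: s.split()[0][:4].capitalize(),  # first four chars of first word
    lambda s: "John",                         # constant output
]

# Hypothetical candidate inputs to ask the user about.
candidates = ["john smith", "jane doe", "alexander bell"]

query = most_significant_input(sampled, candidates)
```

Here the already-labelled input "john smith" has zero entropy because every sampled program agrees on it, while "alexander bell" splits the programs into three groups and is therefore chosen as the next query.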

