PLSE, Jun 22, 2021

SynGuar: Guaranteeing Generalization in Programming by Example

arXiv:2106.11610v1
AI Analysis

The paper addresses overfitting in program synthesis for applications such as program repair, adding a provable generalization guarantee on top of existing methods.

The authors tackle the problem of ensuring that synthesized programs generalize well in Programming by Example (PBE). They propose SynGuar, a framework that guarantees low generalization error with high probability. On the evaluated benchmarks, a few hundred examples often suffice to bound the error below 5% with probability at least 98%.

Programming by Example (PBE) is a program synthesis paradigm in which the synthesizer creates a program that matches a set of given examples. In many applications of such synthesis (e.g., program repair or reverse engineering), the goal is to reconstruct a program that is close to a specific target program, not merely to produce some program that satisfies the seen examples. In such settings, we want the synthesized program to generalize well, i.e., to have as few errors as possible on the unobserved examples that capture the target function's behavior. In this paper, we propose the first framework (called SynGuar) for PBE synthesizers that guarantees low generalization error with high probability. Our main contribution is a procedure to dynamically calculate how many additional examples suffice to theoretically guarantee generalization. We show how our techniques can be used in two well-known synthesis approaches, PROSE and STUN (synthesis through unification), for common string-manipulation program benchmarks. We find that often a few hundred examples suffice to provably bound generalization error below $5\%$ with high ($\geq 98\%$) probability on these benchmarks. Further, we confirm this empirically: SynGuar significantly improves the accuracy of existing synthesizers in generating the right target programs. With fewer, arbitrarily chosen examples, however, the same baseline synthesizers (without SynGuar) overfit and lose accuracy.
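To give a feel for why "a few hundred examples" is a plausible order of magnitude, the sketch below evaluates the textbook PAC sample bound for a finite hypothesis space, $m \geq \frac{1}{\epsilon}\left(\ln|H| + \ln\frac{1}{\delta}\right)$. This is an illustration under stated assumptions, not SynGuar's actual procedure: the abstract does not give the formula, and the function name `sufficient_examples` and the hypothesis-space size of $10^6$ are hypothetical values chosen for the example.

```python
import math


def sufficient_examples(hypothesis_count: int, epsilon: float, delta: float) -> int:
    """Number of i.i.d. examples that suffices, by the standard finite-class
    PAC bound, for any consistent program to have error <= epsilon with
    probability >= 1 - delta.

    Assumption: this textbook bound stands in for the paper's own
    calculation, which is not spelled out in the abstract.
    """
    if hypothesis_count < 1:
        raise ValueError("hypothesis_count must be at least 1")
    if not (0.0 < epsilon < 1.0 and 0.0 < delta < 1.0):
        raise ValueError("epsilon and delta must lie strictly between 0 and 1")
    bound = (math.log(hypothesis_count) + math.log(1.0 / delta)) / epsilon
    return math.ceil(bound)


# Hypothetical setting: one million candidate programs remain,
# target error below 5%, confidence at least 98% (delta = 0.02).
needed = sufficient_examples(10**6, epsilon=0.05, delta=0.02)
print(needed)
```

Because the bound grows only logarithmically in the number of candidate programs, a synthesizer that re-evaluates it as examples eliminate candidates can, in principle, stop asking for examples earlier than a bound fixed up front would allow.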
