Synthesizing Machine Learning Programs with PAC Guarantees via Statistical Sketching
This work addresses the reliability of programs that contain ML components, which is crucial for safety-critical applications such as autonomous systems and healthcare. Its contribution nevertheless appears incremental, since it applies statistical methods to program synthesis.
The paper tackles the problem of synthesizing programs that incorporate machine learning components, such as deep neural networks, while guaranteeing statistical properties, i.e., properties that hold with high probability. It proposes novel sketching and synthesis algorithms based on statistical learning theory and evaluates them on list processing programs with DNN components that process image inputs, as well as on case studies in image classification and precision medicine. The results indicate that the approach can synthesize programs with probabilistic guarantees.
We study the problem of synthesizing programs that include machine learning components such as deep neural networks (DNNs). We focus on statistical properties, which are properties expected to hold with high probability -- e.g., that an image classification model correctly identifies people in images with high probability. We propose novel algorithms for sketching and synthesizing such programs by leveraging ideas from statistical learning theory to provide statistical soundness guarantees. We evaluate our approach on synthesizing list processing programs that include DNN components used to process image inputs, as well as case studies on image classification and on precision medicine. Our results demonstrate that our approach can be used to synthesize programs with probabilistic guarantees.
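To make the idea of a statistical soundness guarantee concrete, here is a minimal sketch of one standard way to fill a single hole in a sketch of the form `score(x) >= ??` using a binomial tail bound. It is an illustration under stated assumptions, not the paper's algorithm: the names `binom_cdf`, `max_allowed_errors`, and `synthesize_threshold`, and the parameters `eps` and `delta`, are hypothetical, and the calibration scores are assumed to be i.i.d. draws from the same distribution the program will see at deployment. Under that assumption, the returned threshold rejects a true positive with probability at most `eps`, and this holds with probability at least `1 - delta` over the draw of the calibration set.

```python
import math
from typing import Optional, Sequence


def binom_cdf(k: int, n: int, p: float) -> float:
    """Exact P[Binomial(n, p) <= k]."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def max_allowed_errors(n: int, eps: float, delta: float) -> int:
    """Largest k with P[Binomial(n, eps) <= k] <= delta, or -1 if no such k exists."""
    k = -1
    while k + 1 <= n and binom_cdf(k + 1, n, eps) <= delta:
        k += 1
    return k


def synthesize_threshold(
    positive_scores: Sequence[float], eps: float, delta: float
) -> Optional[float]:
    """Choose a value for the hole in `score(x) >= ??` (hypothetical sketch).

    positive_scores: DNN scores on held-out examples whose true label is positive.
    Returns a threshold t such that at most k calibration scores lie strictly
    below t, where k is the error budget permitted by the binomial bound.
    Returns None when the calibration set is too small to certify any threshold.
    """
    n = len(positive_scores)
    k = max_allowed_errors(n, eps, delta)
    if k < 0:
        return None  # not enough samples for the requested (eps, delta)
    ordered = sorted(positive_scores)
    # At most k scores are strictly smaller than ordered[k].
    return ordered[k]
```

A caller would pass the scores of held-out positive examples together with the desired error level `eps` and confidence level `delta`. A result of `None` signals that more calibration data is needed before any threshold can be certified at that level.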