LGPLFeb 3, 2024

Learning Structure-Aware Representations of Dependent Types

arXiv:2402.02104v24 citationsh-index: 16NIPS
AI Analysis

This work addresses the challenge of applying machine learning to dependently-typed programming languages like Agda, which is incremental in extending the Agda ecosystem into ML.

The authors tackled the problem of representing dependently-typed programs (specifically Agda code) for machine learning by creating the first extensive dataset of Agda program-proofs and proposing a novel neural architecture based on structural principles. In a premise selection evaluation, their method achieved promising results that surpassed strong baselines.

Agda is a dependently-typed programming language and a proof assistant, pivotal in proof formalization and programming language theory. This paper extends the Agda ecosystem into machine learning territory, and, vice versa, makes Agda-related resources available to machine learning practitioners. We introduce and release a novel dataset of Agda program-proofs that is elaborate and extensive enough to support various machine learning applications -- the first of its kind. Leveraging the dataset's ultra-high resolution, which details proof states at the sub-type level, we propose a novel neural architecture targeted at faithfully representing dependently-typed programs on the basis of structural rather than nominal principles. We instantiate and evaluate our architecture in a premise selection setup, where it achieves promising initial results, surpassing strong baselines.

Code Implementations2 repos
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes