Exploration of Neural Machine Translation in Autoformalization of Mathematics in Mizar
This work addresses autoformalization, a challenge relevant to mathematicians and the formal verification community, but it appears incremental: it applies existing translation models to a new domain with some enhancements.
The paper tackles the problem of automatically translating informal LaTeX mathematics into formal Mizar statements using neural machine translation models. It compares supervised and unsupervised approaches and augments the training data with a custom type-elaboration mechanism integrated into the supervised translation.
In this paper we share several experiments on automatically translating informal mathematics into formal mathematics. In our context, informal mathematics refers to human-written mathematical sentences in the LaTeX format, and formal mathematics refers to statements in the Mizar language. We conducted our experiments with three established neural network-based machine translation models that are known to deliver competitive results on translating between natural languages. To train these models we also prepared four informal-to-formal datasets. We compare and analyze our results according to whether the model is supervised or unsupervised. In order to augment the data available for autoformalization and improve the results, we develop a custom type-elaboration mechanism and integrate it into the supervised translation.