Generating Symbolic Reasoning Problems with Transformer GANs
This addresses data scarcity issues in symbolic reasoning domains, offering a method to enhance training datasets, though it is incremental as it builds on existing GAN and Transformer techniques.
The paper tackled the problem of generating synthetic training data for symbolic reasoning tasks, such as symbolic mathematics and temporal logic, using GANs with Transformer encoders, and found that the generated data could substitute for real data, even improving classifier difficulty with modifications.
We study the capabilities of GANs and Wasserstein GANs equipped with Transformer encoders to generate sensible and challenging training data for symbolic reasoning domains. We conduct experiments on two problem domains where Transformers have been successfully applied recently: symbolic mathematics and temporal specifications in verification. Even without autoregression, our GAN models produce syntactically correct instances. We show that the generated data can be used as a substitute for real training data when training a classifier, and, especially, that training data can be generated from a dataset that is too small to be trained on directly. Using a GAN setting also allows us to alter the target distribution: We show that by adding a classifier uncertainty part to the generator objective, we obtain a dataset that is even harder to solve for a temporal logic classifier than our original dataset.