A Flawed Dataset for Symbolic Equation Verification
This work identifies critical flaws in a dataset proposed for testing AI systems on symbolic equation verification, suggesting that it may be of limited use for advancing symbolic reasoning in machine learning.
The paper critiques a proposed dataset for symbolic equation verification and highlights two major flaws: the generation procedure yields only a limited class of true equations, and artifactual features make true and false equations easy to tell apart. It also notes that a simple probabilistic method already solves the verification problem with high reliability.
Arabshahi, Singh, and Anandkumar (2018) propose a method for creating a dataset of symbolic mathematical equations for the tasks of symbolic equation verification and equation completion. Unfortunately, a dataset constructed using their method will suffer from two serious flaws. First, the class of true equations that the procedure can generate is very limited. Second, because true and false equations are generated in completely different ways, there are likely to be artifactual features that allow easy discrimination. Moreover, over the class of equations they consider, there is a very simple probabilistic procedure that solves the problem of equation verification with extremely high reliability. The usefulness of this problem as a testbed for AI systems is therefore doubtful.
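The abstract does not spell out the probabilistic procedure. One plausible reading, assumed here purely for illustration, is a randomized numerical test: evaluate both sides of the equation at random points and reject the equation if they ever disagree. The sketch below is a minimal version of that idea. The function name `probably_equal`, the sampling range, the tolerance, and the number of trials are all hypothetical choices, not details taken from the paper.

```python
import math
import random


def probably_equal(lhs, rhs, n_vars, trials=20, tol=1e-9, seed=0):
    """Randomized identity test (illustrative sketch, not the paper's code).

    lhs and rhs are Python callables taking n_vars float arguments.
    Returns False as soon as the two sides disagree at a sampled point,
    and True if they agree at every point that could be evaluated.
    """
    rng = random.Random(seed)
    checked = 0
    attempts = 0
    # Cap the attempts so the loop ends even if most points are out of domain.
    while checked < trials and attempts < 100 * trials:
        attempts += 1
        point = [rng.uniform(-2.0, 2.0) for _ in range(n_vars)]
        try:
            left, right = lhs(*point), rhs(*point)
        except (ValueError, ZeroDivisionError, OverflowError):
            continue  # point lies outside the domain of one side; resample
        if not (math.isfinite(left) and math.isfinite(right)):
            continue
        checked += 1
        if not math.isclose(left, right, rel_tol=tol, abs_tol=tol):
            return False  # a single disagreement refutes the equation
    # Agreement at every sampled point is strong, but not conclusive, evidence.
    return checked > 0


# Example equations over trigonometric and polynomial terms.
identity_ok = probably_equal(
    lambda x: math.sin(x) ** 2 + math.cos(x) ** 2, lambda x: 1.0, n_vars=1
)
double_angle_ok = probably_equal(
    lambda x: math.sin(2 * x), lambda x: 2 * math.sin(x) * math.cos(x), n_vars=1
)
false_equation = probably_equal(
    lambda x: math.sin(2 * x), lambda x: 2 * math.sin(x), n_vars=1
)
binomial_ok = probably_equal(
    lambda x, y: (x + y) ** 2, lambda x, y: x * x + 2 * x * y + y * y, n_vars=2
)
```

Under these assumptions, a false equation is accepted only if its two sides happen to coincide at every sampled point, which is why a test of this kind can be highly reliable without any learned model.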