Equation Embeddings
This work addresses the challenge of analyzing equations in scientific texts, each of which is unique or nearly unique, and targets researchers in computer science domains such as NLP, IR, AI, and ML. The contribution is incremental in that it builds on existing word embedding techniques.
The authors represent mathematical equations semantically with an unsupervised method called equation embeddings, which models each equation using the representations of its surrounding words. On four arXiv collections containing ~98.5k equations, the method yields better models than existing word embedding approaches.
We present an unsupervised approach for discovering semantic representations of mathematical equations. Equations are challenging to analyze because each is unique, or nearly unique. Our method, which we call equation embeddings, finds good representations of equations by using the representations of their surrounding words. We used equation embeddings to analyze four collections of scientific articles from the arXiv, covering four computer science domains (NLP, IR, AI, and ML) and $\sim$98.5k equations. Quantitatively, we found that equation embeddings provide better models when compared to existing word embedding approaches. Qualitatively, we found that equation embeddings provide coherent semantic representations of equations and can capture semantic similarity to other equations and to words.
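To make the core idea concrete, here is a minimal sketch of representing an equation through its surrounding words. It is a deliberate simplification and not the model described above: it assumes pre-existing word vectors and simply averages the vectors of the context words, whereas the actual method learns equation representations as part of an embedding model. The toy vectors, the vocabulary, and the helper names `embed_equation`, `cosine`, and `nearest_words` are all hypothetical.

```python
import math

# Toy 3-dimensional word vectors (hypothetical values, for illustration only).
WORD_VECTORS = {
    "gradient": [0.9, 0.1, 0.0],
    "descent": [0.8, 0.2, 0.1],
    "update": [0.7, 0.3, 0.0],
    "topic": [0.0, 0.9, 0.2],
    "document": [0.1, 0.8, 0.3],
    "distribution": [0.2, 0.7, 0.4],
}


def embed_equation(context_words, word_vectors):
    """Average the vectors of the known words that surround an equation.

    Assumption: averaging stands in for the learned equation embedding.
    """
    known = [word_vectors[w] for w in context_words if w in word_vectors]
    if not known:
        raise ValueError("no context word has a vector")
    dim = len(known[0])
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def nearest_words(eq_vec, word_vectors, k=2):
    """Return the k words whose vectors are most similar to the equation vector."""
    ranked = sorted(
        word_vectors,
        key=lambda w: cosine(eq_vec, word_vectors[w]),
        reverse=True,
    )
    return ranked[:k]


# Two equations, each identified only by the words around it.
eq_sgd = embed_equation(["the", "gradient", "descent", "update"], WORD_VECTORS)
eq_lda = embed_equation(["topic", "distribution", "document"], WORD_VECTORS)

print(nearest_words(eq_sgd, WORD_VECTORS, k=3))
print(round(cosine(eq_sgd, eq_lda), 3))
```

Because equations and words share one vector space in this sketch, the same `cosine` function compares an equation to another equation or to a word, which mirrors the two kinds of similarity mentioned in the abstract.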