ML, CL, LG | Mar 24, 2018

Equation Embeddings

arXiv:1803.09123v1 | 35 citations
Originality: Incremental advance
AI Analysis

This paper addresses the challenge of analyzing equations, each unique or nearly unique, in scientific texts from computer science domains (NLP, IR, AI, and ML). It is an incremental advance because it builds on existing word embedding techniques.

The authors address the problem of representing mathematical equations semantically. They develop an unsupervised method, equation embeddings, which models each equation using the representations of its surrounding words. On four arXiv collections containing ~98.5k equations, the method yields better models than existing word embedding approaches.

We present an unsupervised approach for discovering semantic representations of mathematical equations. Equations are challenging to analyze because each is unique, or nearly unique. Our method, which we call equation embeddings, finds good representations of equations by using the representations of their surrounding words. We used equation embeddings to analyze four collections of scientific articles from the arXiv, covering four computer science domains (NLP, IR, AI, and ML) and $\sim$98.5k equations. Quantitatively, we found that equation embeddings provide better models when compared to existing word embedding approaches. Qualitatively, we found that equation embeddings provide coherent semantic representations of equations and can capture semantic similarity to other equations and to words.
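
To make the idea of representing an equation through its surrounding words concrete, here is a minimal sketch. It is not the authors' model: it simply averages the vectors of nearby words, which is a simplifying assumption made for illustration. The word vectors, the `<EQ>` placeholder token, the window size, and the function names (`embed_equation`, `cosine`) are all hypothetical.

```python
import math

# Hypothetical toy word vectors; a real system would learn these from a corpus.
WORD_VECTORS = {
    "gradient": [0.9, 0.1, 0.0],
    "descent": [0.8, 0.2, 0.1],
    "update": [0.7, 0.3, 0.0],
    "topic": [0.0, 0.9, 0.2],
    "document": [0.1, 0.8, 0.3],
    "distribution": [0.1, 0.7, 0.4],
}
DIM = 3


def embed_equation(tokens, eq_index, window=2):
    """Average the vectors of known words within `window` tokens of the equation.

    Assumption: plain averaging stands in for a learned equation representation.
    """
    lo = max(0, eq_index - window)
    hi = min(len(tokens), eq_index + window + 1)
    context = [
        WORD_VECTORS[tokens[i]]
        for i in range(lo, hi)
        if i != eq_index and tokens[i] in WORD_VECTORS
    ]
    if not context:
        return [0.0] * DIM
    return [sum(component) / len(context) for component in zip(*context)]


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)


# Each equation is replaced by a placeholder token at its position in the text.
sentence_a = ["gradient", "descent", "<EQ>", "update", "rule"]
sentence_b = ["topic", "distribution", "<EQ>", "document", "model"]

eq_a = embed_equation(sentence_a, 2)
eq_b = embed_equation(sentence_b, 2)

# Compare each equation vector with word vectors, and the two equations with each other.
print(cosine(eq_a, WORD_VECTORS["gradient"]), cosine(eq_a, WORD_VECTORS["topic"]))
print(cosine(eq_a, eq_b))
```

Because an equation vector built this way lives in the same space as the word vectors, it can be compared both to other equations and to individual words, which is the kind of similarity query the abstract describes.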

