CV, Dec 23, 2020

ConvMath: A Convolutional Sequence Network for Mathematical Expression Recognition

arXiv:2012.12619v1, 20 citations
AI Analysis

This work addresses the challenge of recognizing two-dimensional mathematical expressions for researchers and practitioners working with OCR and document digitization, offering an incremental improvement in both accuracy and efficiency.

This paper introduces ConvMath, a convolutional sequence network designed to convert mathematical expressions from images into LaTeX sequences. The model achieves state-of-the-art accuracy and significantly improved efficiency on the IM2LATEX-100K dataset compared to previous methods.

Despite recent advances in optical character recognition (OCR), mathematical expressions remain difficult to recognize because of their two-dimensional graphical layout. In this paper, we propose a convolutional sequence modeling network, ConvMath, which converts the mathematical expression depicted in an image into a LaTeX sequence in an end-to-end way. The network combines an image encoder for feature extraction and a convolutional decoder for sequence generation. Unlike encoder-decoder models based on Long Short-Term Memory (LSTM), ConvMath is entirely based on convolution, so it is easy to parallelize. In addition, the network adopts a multi-layer attention mechanism in the decoder, which allows the model to align output symbols with source feature vectors automatically and alleviates the problem of insufficient coverage during training. The performance of ConvMath is evaluated on an open dataset, IM2LATEX-100K, which contains 103,556 samples. The experimental results demonstrate that the proposed network achieves state-of-the-art accuracy and much better efficiency than previous methods.
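The two ideas the abstract leans on, a convolutional decoder that needs no recurrence and attention that aligns output symbols with image feature vectors, can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the function names `causal_conv1d` and `attend`, the toy shapes, and the choice of plain dot-product attention are all assumptions made for illustration, since the abstract does not specify the attention form or layer details.

```python
import math

def causal_conv1d(x, w):
    """Hypothetical helper: 1-D causal convolution over a symbol sequence.

    x: list of T input vectors, each of length C_in.
    w: kernel of shape K x C_in x C_out (nested lists).
    Output step t uses only x[t-K+1 .. t], never future steps.
    """
    k, c_in, c_out = len(w), len(w[0]), len(w[0][0])
    # Left-pad with K-1 zero vectors so the first outputs see no future input.
    padded = [[0.0] * c_in for _ in range(k - 1)] + x
    out = []
    # Each output depends only on inputs, never on earlier outputs,
    # so all positions could be computed in parallel (unlike an LSTM).
    for t in range(len(x)):
        y = [0.0] * c_out
        for i in range(k):
            for c in range(c_in):
                for d in range(c_out):
                    y[d] += padded[t + i][c] * w[i][c][d]
        out.append(y)
    return out

def attend(queries, features):
    """Hypothetical helper: dot-product attention (an assumed form).

    queries: T decoder states of length D.
    features: N encoder feature vectors of length D.
    Returns (contexts, weights); each weight row sums to 1.
    """
    contexts, all_weights = [], []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, f)) for f in features]
        m = max(scores)  # subtract the max for numerical stability
        exps = [math.exp(s - m) for s in scores]
        z = sum(exps)
        weights = [e / z for e in exps]
        ctx = [sum(wt * f[d] for wt, f in zip(weights, features))
               for d in range(len(features[0]))]
        contexts.append(ctx)
        all_weights.append(weights)
    return contexts, all_weights

# Toy usage: a width-2 kernel of ones sums each input with its predecessor.
conv_out = causal_conv1d([[1.0], [2.0], [3.0]], [[[1.0]], [[1.0]]])
# A query close to the first feature vector attends mostly to it.
ctx, attn = attend([[10.0, 0.0]], [[1.0, 0.0], [0.0, 1.0]])
```

In a multi-layer setup of the kind the abstract describes, a step like `attend` would be applied after each decoder convolution layer rather than once; the sketch shows a single layer only.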

