Spatial Attention and Syntax Rule Enhanced Tree Decoder for Offline Handwritten Mathematical Expression Recognition
This work addresses a domain-specific problem in mathematical expression recognition, offering incremental improvements over existing tree decoder methods.
The paper tackles the problem of offline handwritten mathematical expression recognition by proposing a model that integrates spatial attention and syntax rules to reduce tree node prediction errors and enforce grammatical constraints, achieving improved recognition performance on CROHME datasets.
Offline Handwritten Mathematical Expression Recognition (HMER) has advanced dramatically in recent years through the use of tree decoders within the encoder-decoder framework. Although tree decoder-based methods treat an expression as a tree and parse its 2D spatial structure into a sequence of tree nodes, their performance remains poor because of inevitable errors in tree node prediction. Moreover, they lack syntax rules to regulate the output expressions. In this paper, we propose a novel model called Spatial Attention and Syntax Rule Enhanced Tree Decoder (SS-TD), which is equipped with a spatial attention mechanism to alleviate tree structure prediction errors and uses syntax masks (obtained by transforming syntax rules) to constrain the occurrence of ungrammatical mathematical expressions. In this way, our model can effectively describe the tree structure and increase the accuracy of the output expressions. Experiments show that SS-TD achieves better recognition performance than prior models on the CROHME 14/16/19 datasets, demonstrating the effectiveness of our model.
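To illustrate how a syntax mask could constrain a decoder's output, the following is a minimal sketch rather than the SS-TD implementation. The abstract does not specify how masks are applied, so we assume here that the mask is applied to the logits before the softmax. The vocabulary `VOCAB`, the rule table `ALLOWED`, and the functions `syntax_mask`, `masked_softmax`, and `predict_child` are all hypothetical names introduced for illustration.

```python
import math

# Hypothetical toy vocabulary; not taken from the paper.
VOCAB = ["x", "2", "+", "\\frac", "<eos>"]

# Assumed toy grammar: ALLOWED[parent] is the set of child symbols
# that may follow the given parent node.
ALLOWED = {
    "\\frac": {"x", "2", "\\frac"},  # a fraction must be followed by an operand
    "+": {"x", "2", "\\frac"},
    "x": {"+", "2", "<eos>"},
}

def syntax_mask(parent):
    """Turn the rule for `parent` into a 0/1 mask over VOCAB."""
    return [1.0 if symbol in ALLOWED[parent] else 0.0 for symbol in VOCAB]

def masked_softmax(logits, mask):
    """Softmax in which masked-out entries receive exactly zero probability."""
    masked = [l if m > 0 else -math.inf for l, m in zip(logits, mask)]
    peak = max(masked)  # finite as long as the mask allows at least one symbol
    exps = [math.exp(v - peak) for v in masked]  # exp(-inf) evaluates to 0.0
    total = sum(exps)
    return [e / total for e in exps]

def predict_child(logits, parent):
    """Pick the most probable child symbol that the grammar permits."""
    probs = masked_softmax(logits, syntax_mask(parent))
    best = max(range(len(VOCAB)), key=probs.__getitem__)
    return VOCAB[best], probs

# The raw logits favour "+", which the toy grammar forbids directly under "\frac".
symbol, probs = predict_child([0.1, 0.2, 3.0, 0.5, 2.0], "\\frac")
print(symbol)
```

In this toy case the unmasked argmax would be the ungrammatical "+", while the masked prediction falls back to the highest-scoring symbol that the assumed rule allows.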