Symbol-Aware Reasoning with Masked Discrete Diffusion for Handwritten Mathematical Expression Recognition
This addresses the problem of exposure bias and syntactic inconsistency in HMER for applications requiring accurate recognition of diverse handwritten symbols and layouts.
The paper tackles handwritten mathematical expression recognition by replacing autoregressive models with a discrete diffusion framework that refines symbols and structural relations iteratively, achieving 5.56% CER and 60.42% EM on the MathWriting benchmark.
Handwritten Mathematical Expression Recognition (HMER) requires reasoning over diverse symbols and 2D structural layouts, yet autoregressive models struggle with exposure bias and syntactic inconsistency. We present a discrete diffusion framework that reformulates HMER as iterative symbolic refinement instead of sequential generation. Through multi-step remasking, the proposal progressively refines both symbols and structural relations, removing causal dependencies and improving structural consistency. A symbol-aware tokenization and Random-Masking Mutual Learning further enhance syntactic alignment and robustness to handwriting diversity. On the MathWriting benchmark, the proposal achieves 5.56\% CER and 60.42\% EM, outperforming strong Transformer and commercial baselines. Consistent gains on CROHME 2014--2023 demonstrate that discrete diffusion provides a new paradigm for structure-aware visual recognition beyond generative modeling.