Image to LaTeX with Graph Neural Network for Mathematical Formula Recognition

DOCUMENT ANALYSIS AND RECOGNITION - ICDAR 2021, PT II(2021)

Abstract
Mathematical formula recognition aims to automatically convert formula images into structured description formats. Several encoder-decoder models have recently been proposed for this task, but they seldom explicitly consider the spatial relationships among symbols. In this paper, we propose a novel encoder-decoder model with a Graph Neural Network (GNN) to translate mathematical formula images into LaTeX code. In the proposed model, the symbols segmented from the raw image are used to build graphs based on their spatial connections. The encoder consists of a Convolutional Neural Network (CNN) and a GNN: the CNN extracts visual features from the whole formula or from individual symbols, and the GNN propagates the spatial information embedded in the built graphs. The decoder is a Recurrent Neural Network (RNN) that implements a language model to generate the output sequence from the encoded features with an attention mechanism. Experimental results on the IM2LATEX-100K dataset demonstrate that the proposed model achieves better performance than state-of-the-art approaches.
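The graph-building and propagation steps described in the abstract can be illustrated with a minimal sketch. The abstract does not specify how edges are chosen or how the GNN updates features, so everything here is an assumption made for illustration: the bounding boxes are hypothetical, edges come from a k-nearest-neighbour rule on box centers, and the propagation step is a plain neighbourhood mean with no learned weights. The function names `build_knn_graph` and `propagate` are likewise invented for this example and are not the authors' implementation.

```python
import math


def box_center(box):
    """Return the (x, y) center of a box given as (x_min, y_min, x_max, y_max)."""
    x0, y0, x1, y1 = box
    return ((x0 + x1) / 2.0, (y0 + y1) / 2.0)


def build_knn_graph(boxes, k=2):
    """Link each symbol to its k nearest neighbours by center distance.

    Returns an undirected adjacency mapping {node: sorted neighbour list}.
    The k-nearest-neighbour rule is an assumption of this sketch.
    """
    centers = [box_center(b) for b in boxes]
    adjacency = {i: set() for i in range(len(boxes))}
    for i, ci in enumerate(centers):
        others = sorted(
            (j for j in range(len(centers)) if j != i),
            key=lambda j: math.dist(ci, centers[j]),
        )
        for j in others[:k]:
            adjacency[i].add(j)
            adjacency[j].add(i)
    return {i: sorted(neigh) for i, neigh in adjacency.items()}


def propagate(features, adjacency):
    """One round of mean aggregation over each node and its neighbours."""
    updated = []
    for i, feat in enumerate(features):
        group = [feat] + [features[j] for j in adjacency[i]]
        updated.append([sum(vals) / len(group) for vals in zip(*group)])
    return updated


# Hypothetical symbol boxes for "x^2 + y": x, superscript 2, +, y
boxes = [(0, 10, 10, 20), (11, 0, 16, 8), (20, 10, 30, 20), (34, 10, 44, 22)]
graph = build_knn_graph(boxes, k=1)

# Stand-in per-symbol feature vectors (a CNN would produce these in practice)
features = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0], [0.0, 0.0]]
smoothed = propagate(features, graph)
```

With `k=1`, the base `x` pairs with its superscript and `+` pairs with `y`, so each node's updated feature mixes in information from its spatial neighbour. A trained model would replace the mean with learned message and update functions and feed the result to the attention-based decoder.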
Keywords
Mathematical formula recognition, Graph neural network, Encoder-decoder architecture, Image to LaTeX