Towards Math Terms Disambiguation Using Machine Learning

Ruocheng Shan, Abdou Youssef

Intelligent Computer Mathematics (CICM 2021), 2021

Abstract
Word disambiguation has long been an important task in natural language processing, but it remains underexplored for mathematical text. As in natural languages, some math terms do not have a unique interpretation. Because math text is an important part of the scientific literature, an accurate and efficient way of disambiguating math terms would be a significant contribution. In this paper, we present some investigations on math-term disambiguation using machine learning. All experimental data are selected from the DLMF dataset. Our experiments consist of three steps: (1) create a labeled dataset of math equations (from the DLMF) where the instances are (math token, token meaning) pairs, grouped by equation; (2) build machine learning models and train them on our labeled dataset; and (3) evaluate and compare the performance of our models using different evaluation metrics. Our results show that machine learning is an effective approach to math-term disambiguation. The accuracy of our models ranges from 70% to 85%. There is potential for considerable improvement once much larger labeled datasets with more balanced classes are available.
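The three-step pipeline in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the toy tokens and meaning labels are invented rather than taken from the DLMF-derived dataset, and a most-frequent-sense baseline stands in for the paper's models, which the abstract does not name. Macro-averaged F1 is shown next to accuracy as one example of a metric that is sensitive to class imbalance.

```python
from collections import Counter, defaultdict

# Step 1 (illustrative): labeled data as (math token, token meaning) pairs,
# grouped by equation. Tokens and labels below are hypothetical toy values,
# not entries from the paper's dataset.
train_equations = [
    [("e", "euler_number"), ("i", "imaginary_unit"), ("\\pi", "constant_pi")],
    [("e", "euler_number"), ("x", "variable")],
    [("i", "summation_index"), ("n", "variable")],
    [("i", "imaginary_unit"), ("\\pi", "constant_pi")],
    [("e", "variable"), ("x", "variable")],
]
test_equations = [
    [("e", "euler_number"), ("i", "imaginary_unit")],
    [("i", "summation_index"), ("\\pi", "constant_pi")],
]


def flatten(equations):
    """Turn equation-grouped pairs into one flat list of (token, meaning)."""
    return [pair for equation in equations for pair in equation]


# Step 2 (stand-in model): most-frequent-sense baseline, assumed here only
# because it needs no external libraries.
def train_majority_sense(equations):
    counts = defaultdict(Counter)
    for token, meaning in flatten(equations):
        counts[token][meaning] += 1
    return {token: c.most_common(1)[0][0] for token, c in counts.items()}


def predict(model, token, default="unknown"):
    return model.get(token, default)


# Step 3: evaluate with accuracy and macro-averaged F1.
def accuracy(gold, pred):
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


def macro_f1(gold, pred):
    scores = []
    for label in sorted(set(gold) | set(pred)):
        tp = sum(g == label and p == label for g, p in zip(gold, pred))
        fp = sum(g != label and p == label for g, p in zip(gold, pred))
        fn = sum(g == label and p != label for g, p in zip(gold, pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


model = train_majority_sense(train_equations)
test_pairs = flatten(test_equations)
gold = [meaning for _, meaning in test_pairs]
pred = [predict(model, token) for token, _ in test_pairs]

acc = accuracy(gold, pred)
f1 = macro_f1(gold, pred)
print(f"accuracy={acc:.3f} macro_f1={f1:.3f}")
```

On this toy data the baseline always assigns the majority sense, so it misses the rarer reading of `i`, and macro-F1 comes out lower than accuracy. This is the kind of gap that the abstract's remark about more balanced classes points to.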
Keywords
Math-term, LaTeX, Disambiguation, Mathematical equations, Machine Learning