Approach Zero and Anserini at the CLEF-2021 ARQMath Track - Applying Substructure Search and BM25 on Operator Tree Path Tokens.

CLEF (2021)

Abstract
This paper reports on Approach Zero, a substructure-aware math search system, as applied in our submission to the ARQMath lab at CLEF 2021. We participated in both Task 1 (math ARQ) and Task 2 (formula retrieval) this year. In addition to substructure retrieval, we added a traditional full-text search pass based on the Anserini toolkit [1]. We use the same path features extracted from the Operator Tree (OPT) to index and retrieve math formulas in Anserini, and we interpolate the Anserini results with the structural results from Approach Zero. We also explore automatic and table-based keyword expansion methods for math formulas. Additionally, we report preliminary results from using previous years' labels and applying learning to rank to our first-stage search results. In this lab, we obtained the most effective search results in Task 2 (formula retrieval) among submissions from 7 participants, including the baseline system. Our experiments also show a substantial improvement over the baseline result we produced in the previous year.
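The two ideas named in the abstract, path tokens taken from an operator tree and interpolation of two result lists, can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the tuple-based tree encoding, the underscore-joined leaf-to-root token format, the min-max normalization, the weight `alpha`, and the helper names `leaf_root_paths`, `min_max`, and `interpolate`.

```python
def leaf_root_paths(tree, prefix=()):
    """Return one token per leaf, spelling the path from that leaf up to the root.

    Assumed encoding: a leaf is a str, an internal node is (operator, [children]).
    """
    if isinstance(tree, str):
        return ["_".join((tree,) + prefix)]
    op, children = tree
    tokens = []
    for child in children:
        # Prepend the current operator so the token reads leaf -> root.
        tokens.extend(leaf_root_paths(child, (op,) + prefix))
    return tokens


def min_max(scores):
    """Rescale a {doc_id: score} map to [0, 1] (assumed normalization)."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def interpolate(structural, bm25, alpha=0.5):
    """Linearly combine two score maps; a missing document counts as 0."""
    a, b = min_max(structural), min_max(bm25)
    merged = {
        doc: alpha * a.get(doc, 0.0) + (1 - alpha) * b.get(doc, 0.0)
        for doc in set(a) | set(b)
    }
    # Highest score first; ties broken by document id for a stable order.
    return sorted(merged.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical operator tree for a + b * c
opt = ("add", ["a", ("times", ["b", "c"])])
tokens = leaf_root_paths(opt)

# Hypothetical scores from a structural pass and a BM25 pass
ranking = interpolate({"d1": 10.0, "d2": 5.0}, {"d2": 4.0, "d3": 2.0}, alpha=0.8)
```

In this sketch the tokens could be indexed as ordinary terms by any full-text engine, which is what makes a BM25 pass over formulas possible; the weight `alpha` controls how much the structural pass dominates the merged ranking.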
Keywords
operator tree path tokens, substructure search, BM25