Tabular reasoning via two-stage knowledge injection

International Journal of Machine Learning and Cybernetics (2024)

Abstract

Tabular reasoning requires understanding natural language queries in the context of a provided table, which is challenging mainly because of the complex logical operations involved. Pre-trained language models have demonstrated their capabilities in various tasks. However, pre-training specifically for tabular reasoning is difficult because it requires a diverse range of reasoning abilities beyond contextual understanding. In this work, we propose Tabular Reasoning with Two-stage Knowledge Injection (TsKI). TsKI consists of two components: TsKI_Stage1 and TsKI_Stage2. TsKI_Stage1 incorporates symbolic knowledge into pre-trained language models by using synthesized programs. It first generates high-quality programs with a program synthesis algorithm, then pre-trains on the automatically generated corpus so that the model learns to query tables with the generated programs. TsKI_Stage2 injects step-wise knowledge into the model. It first decomposes natural language queries into multiple sub-queries using heuristic rules and a constituency parser. It then uses the pre-trained language models themselves to query tables with these sub-queries, obtaining intermediate results that facilitate step-wise tabular reasoning. Experimental results show that TsKI achieves significant improvements on two well-known tabular reasoning datasets, TabFact and WikiTableQuestions, with both TsKI_Stage1 and TsKI_Stage2. In-depth analysis further validates the effectiveness of each component. The code is available at https://github.com/qshi95/TsKI.
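
The Stage 1 idea described above (synthesize programs over a table, execute them, and use the results as pre-training data) can be illustrated with a minimal sketch. This is not the authors' synthesis algorithm or data format: the toy table, the two program templates, and the helper names `linearize`, `synthesize_program`, `execute`, and `make_example` are all hypothetical and chosen only for illustration.

```python
import random

# Hypothetical toy table: a list of rows, each mapping column name -> value.
TABLE = [
    {"player": "Ann", "team": "Red", "points": 31},
    {"player": "Bob", "team": "Blue", "points": 24},
    {"player": "Cy", "team": "Red", "points": 18},
]


def linearize(table):
    """Flatten a table into a single string (an assumed input format)."""
    header = " | ".join(table[0].keys())
    rows = [" | ".join(str(v) for v in row.values()) for row in table]
    body = " ".join(f"row {i + 1} : {r}" for i, r in enumerate(rows))
    return f"col : {header} {body}"


def synthesize_program(table, rng):
    """Sample a program from two assumed templates, grounded in a real row."""
    row = rng.choice(table)
    if rng.random() < 0.5:
        # Count the rows whose team matches the sampled row's team.
        return ("count", "team", row["team"])
    # Maximum of `points` among rows whose team matches.
    return ("max", "points", "team", row["team"])


def execute(program, table):
    """Run a program on the table to obtain its ground-truth result."""
    op = program[0]
    if op == "count":
        _, col, val = program
        return sum(1 for r in table if r[col] == val)
    if op == "max":
        _, target, col, val = program
        return max(r[target] for r in table if r[col] == val)
    raise ValueError(f"unknown operation: {op}")


def make_example(table, rng):
    """Build one (input, target) pair for a sequence-to-sequence model."""
    program = synthesize_program(table, rng)
    source = " ".join(str(tok) for tok in program) + " " + linearize(table)
    return {"input": source, "target": str(execute(program, table))}


if __name__ == "__main__":
    rng = random.Random(0)
    for _ in range(3):
        print(make_example(TABLE, rng))
```

Grounding each sampled program in an existing row keeps the filter non-empty, so every program executes to a well-defined answer. A real pipeline would need many more operators and a quality filter over the generated programs.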

Keywords

Tabular reasoning, Natural language processing, Pre-trained language model, Knowledge injection
