MATHSENSEI: A Tool-Augmented Large Language Model for Mathematical Reasoning
CoRR (2024)
Abstract
Tool-augmented Large Language Models (TALM) are known to enhance the skillset
of large language models (LLM), thereby leading to their improved reasoning
abilities across many tasks. While TALMs have been successfully employed in
different question-answering benchmarks, their efficacy on complex mathematical
reasoning benchmarks, and the potential complementary benefits offered by tools
for knowledge retrieval and mathematical equation solving, are open research
questions. In this work, we present MATHSENSEI, a tool-augmented large language
model for mathematical reasoning. Augmented with tools for knowledge retrieval
(Bing Web Search), program execution (Python), and symbolic equation solving
(Wolfram-Alpha), we study the complementary benefits of these tools through
evaluations on mathematical reasoning datasets. We perform exhaustive ablations
on MATH, a popular dataset for evaluating mathematical reasoning on diverse
mathematical disciplines. We also conduct experiments involving well-known tool
planners to study the impact of tool sequencing on the model performance.
MATHSENSEI achieves 13.5% better accuracy over
chain-of-thought on the MATH dataset. We further observe that TALMs are not as
effective for simpler math word problems (in GSM-8k), and the benefit increases
as the complexity and required knowledge increases (progressively over AQuA,
MMLU-Math, and higher level complex questions in MATH). The code and data are
available at https://github.com/Debrup-61/MathSensei.
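The abstract describes a pipeline that sequences tools (knowledge retrieval, program execution, symbolic solving) over a shared context. The following is a minimal sketch of such a tool-sequencing loop, not the authors' implementation: every tool here is a stub (the real system calls Bing Web Search, a Python executor, and Wolfram-Alpha), and all function and variable names are hypothetical.

```python
# Hypothetical sketch of a tool-augmented reasoning pipeline: a plan (fixed
# or planner-chosen) names tools that run in order, each appending its output
# to a shared context. All tools below are stubs standing in for real APIs.
from typing import Callable, Dict, List


def knowledge_retrieval(query: str, context: str) -> str:
    # Stub for a web-search tool (e.g. Bing Web Search in the paper):
    # would append retrieved facts relevant to the query.
    return context + f"\n[retrieved facts for: {query}]"


def program_execution(query: str, context: str) -> str:
    # Stub for a generate-and-run-Python tool; here a trivial
    # arithmetic expression stands in for generated code.
    return context + f"\n[program output: {2 + 3}]"


def symbolic_solver(query: str, context: str) -> str:
    # Stub for a symbolic equation solver (Wolfram-Alpha in the paper).
    return context + f"\n[symbolic solution for: {query}]"


TOOLS: Dict[str, Callable[[str, str], str]] = {
    "search": knowledge_retrieval,
    "python": program_execution,
    "wolfram": symbolic_solver,
}


def run_pipeline(question: str, plan: List[str]) -> str:
    """Run the tools named in `plan` in order, threading the context
    through so each tool sees the accumulated trace."""
    context = question
    for tool_name in plan:
        context = TOOLS[tool_name](question, context)
    return context


trace = run_pipeline("Solve x^2 - 5x + 6 = 0", ["search", "python", "wolfram"])
print(trace)
```

The paper's ablations over tool sequencing correspond, in this sketch, to varying the `plan` list and comparing downstream accuracy.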