ChatGLM-Math: Improving Math Problem-Solving in Large Language Models with a Self-Critique Pipeline
arXiv (2024)
Abstract
Large language models (LLMs) have shown excellent mastery of human
language, but still struggle in real-world applications that require
mathematical problem-solving. While many strategies and datasets for enhancing
LLMs' mathematical abilities have been developed, it remains a challenge to
simultaneously maintain and improve both language and mathematical capabilities
in deployed LLM systems. In this work, we tailor the Self-Critique pipeline, which addresses
the challenge in the feedback learning stage of LLM alignment. We first train a
general Math-Critique model from the LLM itself to provide feedback signals.
Then, we sequentially employ rejective fine-tuning and direct preference
optimization over the LLM's own generations for data collection. Based on
ChatGLM3-32B, we conduct a series of experiments on both academic benchmarks and
our newly created challenging dataset, MathUserEval. Results show that our pipeline
significantly enhances the LLM's mathematical problem-solving while still
improving its language ability, outperforming LLMs that could be two times
larger. Related techniques have been deployed to
ChatGLM, an online serving LLM. The related
evaluation dataset and scripts have been released.
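
The data-collection idea described above (score the model's own generations with a critique model, then reuse those scores for both rejective fine-tuning and direct preference optimization) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the names `score_samples`, `select_rft`, `build_dpo_pair`, and `toy_critique`, the score scale, the thresholds `min_score` and `min_gap`, and the best-versus-worst pairing rule are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Scored:
    question: str
    answer: str
    score: float


def score_samples(
    question: str,
    answers: List[str],
    critique: Callable[[str, str], float],
) -> List[Scored]:
    """Score each sampled answer with a critique function (hypothetical interface)."""
    return [Scored(question, a, critique(question, a)) for a in answers]


def select_rft(scored: List[Scored], min_score: float) -> List[Tuple[str, str]]:
    """Rejection step: keep only answers whose score is at least min_score."""
    return [(s.question, s.answer) for s in scored if s.score >= min_score]


def build_dpo_pair(
    scored: List[Scored], min_gap: float
) -> Optional[Tuple[str, str, str]]:
    """Return (question, chosen, rejected), or None if the scores are too close."""
    if len(scored) < 2:
        return None
    best = max(scored, key=lambda s: s.score)
    worst = min(scored, key=lambda s: s.score)
    if best.score - worst.score < min_gap:
        return None
    return (best.question, best.answer, worst.answer)


def toy_critique(question: str, answer: str) -> float:
    # Hypothetical stand-in for a trained critique model:
    # high score if the answer ends with the correct result "4".
    return 10.0 if answer.strip().endswith("4") else 2.0


question = "What is 2 + 2?"
answers = ["2 + 2 = 4", "2 + 2 = 5", "The answer is 4"]

scored = score_samples(question, answers, toy_critique)
rft_data = select_rft(scored, min_score=8.0)    # supervised fine-tuning examples
dpo_pair = build_dpo_pair(scored, min_gap=5.0)  # one preference pair
```

In a real setting the critique function would be a trained model and the answers would be sampled from the LLM being aligned; the resulting pairs could then be passed to any DPO implementation.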