Automatic Prompt Augmentation and Selection with Chain-of-Thought from Labeled Data
arXiv (2023)
Abstract
Chain-of-thought prompting (CoT) advances the reasoning abilities of large
language models (LLMs) and achieves superior performance in arithmetic,
commonsense, and symbolic reasoning tasks. However, most CoT studies rely on
carefully designed, human-annotated rationale chains to prompt the language
model, which poses a challenge for real-world applications where labeled
training data is available but human-annotated rationale chains are not. This
creates a barrier to applying CoT prompting to such general tasks. This
paper proposes a new strategy, Automate-CoT (Automatic Prompt Augmentation and
Selection with Chain-of-Thought), which bypasses human engineering of CoTs by
automatically augmenting rationale chains from a small labeled dataset and then
pruning low-quality chains, using the labels, to construct a candidate pool of
machine-generated rationale chains. Finally, it selects the optimal
combination of several rationale chains from the pool for CoT prompting by
employing a variance-reduced policy gradient strategy to estimate the
significance of each example in a black-box language model. Automate-CoT
enables a quick adaptation of the CoT technique to different tasks.
Experimental results demonstrate the effectiveness of our method, where
state-of-the-art results are achieved on arithmetic reasoning (+2.7%),
commonsense reasoning (+3.4%), symbolic reasoning (+3.2%), and non-reasoning
tasks (+2.5%). Our code will be available at
https://github.com/shizhediao/automate-cot.
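The selection step described above can be illustrated with a minimal sketch: a categorical distribution is kept over the candidate pool for each prompt slot, combinations are sampled, scored by a reward (e.g. dev-set accuracy), and the distribution is updated with a policy gradient whose baseline is the mean reward of the batch, one common variance-reduction trick. All names (`select_rationales`, `reward_fn`) and the toy reward are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def select_rationales(pool_size, k, reward_fn, n_steps=200, n_samples=5,
                      lr=0.5, rng=None):
    """Sketch of variance-reduced policy-gradient selection.

    Learns one categorical distribution per prompt slot over a pool of
    candidate rationale chains; the mean batch reward serves as the
    baseline (variance reduction for the REINFORCE estimator).
    """
    rng = np.random.default_rng(rng)
    logits = np.zeros((k, pool_size))  # one distribution per prompt slot
    for _ in range(n_steps):
        # Softmax over each slot's logits.
        probs = np.exp(logits - logits.max(axis=1, keepdims=True))
        probs /= probs.sum(axis=1, keepdims=True)
        samples, rewards = [], []
        for _ in range(n_samples):
            idx = [int(rng.choice(pool_size, p=probs[j])) for j in range(k)]
            samples.append(idx)
            rewards.append(reward_fn(idx))  # e.g. dev-set accuracy (black box)
        baseline = float(np.mean(rewards))  # subtract mean reward of the batch
        for idx, r in zip(samples, rewards):
            for j, i in enumerate(idx):
                grad = -probs[j].copy()     # d log p(i) / d logits
                grad[i] += 1.0
                logits[j] += lr * (r - baseline) * grad
    # Return the highest-probability candidate for each slot.
    return [int(np.argmax(logits[j])) for j in range(k)]

# Toy reward (hypothetical): fraction of slots matching a fixed "good" combo.
target = [2, 0, 3]
reward = lambda idx: sum(a == b for a, b in zip(idx, target)) / len(target)
best = select_rationales(pool_size=5, k=3, reward_fn=reward, rng=0)
```

In practice the reward would come from querying the black-box LLM on held-out labeled examples with the sampled prompt, so gradients flow only through the selection distribution, never through the model itself.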
Keywords
automatic prompt augmentation, selection, data, chain-of-thought