PRompt Optimization in Multi-Step Tasks (PROMST): Integrating Human Feedback and Preference Alignment
CoRR (2024)
Abstract
Prompt optimization aims to find the best prompt to a large language model
(LLM) for a given task. LLMs have been successfully used to help find and
improve prompt candidates for single-step tasks. However, realistic tasks for
agents are multi-step and introduce new challenges: (1) Prompt content is
likely to be more extensive and complex, making it more difficult for LLMs to
analyze errors, (2) the impact of an individual step is difficult to evaluate,
and (3) different people may have varied preferences about task execution.
While humans struggle to optimize prompts, they are good at providing feedback
about LLM outputs; we therefore introduce a new LLM-driven discrete prompt
optimization framework that incorporates human-designed feedback rules about
potential errors to automatically offer direct suggestions for improvement. Our
framework is stylized as a genetic algorithm in which an LLM generates new
candidate prompts from a parent prompt and its associated feedback; we use a
learned heuristic function that predicts prompt performance to efficiently
sample from these candidates. This approach significantly outperforms both
human-engineered prompts and several other prompt optimization methods across
eight representative multi-step tasks (an average 27.7% [...]
to current best methods on GPT-3.5 and GPT-4, respectively). We further show
that the score function for tasks can be modified to better align with
individual preferences. We believe our work can serve as a benchmark for
automatic prompt optimization for LLM-driven multi-step tasks. Datasets and
code are available at https://github.com/yongchao98/PROMST. The project page is
available at https://yongchao98.github.io/MIT-REALM-PROMST.
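
The loop described in the abstract (an LLM proposes child prompts from a parent prompt and its feedback, a heuristic ranks the children, and only the most promising ones are evaluated on the task) can be illustrated with a minimal sketch. This is not the authors' implementation: the names `optimize_prompt`, `propose`, `evaluate`, and `heuristic` are hypothetical, the LLM and the multi-step task are replaced by toy stand-in functions, and the heuristic is hand-written rather than learned.

```python
from typing import Callable, List, Tuple

def optimize_prompt(
    seed_prompt: str,
    propose: Callable[[str, str], List[str]],      # (parent, feedback) -> child prompts; stands in for the LLM
    evaluate: Callable[[str], Tuple[float, str]],  # prompt -> (task score, rule-based feedback)
    heuristic: Callable[[str], float],             # cheap predicted score used to pick which children to run
    generations: int = 3,
    keep: int = 1,
) -> Tuple[str, float]:
    """Genetic-style search over prompts; returns the best prompt seen and its score."""
    score, feedback = evaluate(seed_prompt)
    population = [(seed_prompt, feedback)]
    best = (seed_prompt, score)
    for _ in range(generations):
        # Generate children from every parent and its associated feedback.
        candidates: List[str] = []
        for parent, fb in population:
            candidates.extend(propose(parent, fb))
        # Rank by predicted score and evaluate only the top `keep` candidates.
        candidates.sort(key=heuristic, reverse=True)
        population = []
        for cand in candidates[:keep]:
            s, fb = evaluate(cand)
            population.append((cand, fb))
            if s > best[1]:
                best = (cand, s)
    return best

# --- Toy stand-ins (assumptions for illustration only) ---
REQUIRED = ["check inventory", "stop when done"]

def toy_evaluate(prompt: str) -> Tuple[float, str]:
    # Score is the fraction of required instructions present;
    # feedback names the first missing instruction, or is empty.
    missing = [r for r in REQUIRED if r not in prompt]
    return 1 - len(missing) / len(REQUIRED), (missing[0] if missing else "")

def toy_propose(parent: str, feedback: str) -> List[str]:
    # One child addresses the feedback; one is an unhelpful distractor.
    children = [parent + " please"]
    if feedback:
        children.insert(0, f"{parent} Remember to {feedback}.")
    return children

def toy_heuristic(prompt: str) -> float:
    # Hand-written proxy: penalize the distractor token.
    return -prompt.count("please")

best_prompt, best_score = optimize_prompt(
    "You are a warehouse agent.", toy_propose, toy_evaluate, toy_heuristic
)
print(best_prompt, best_score)
```

Because the task score enters only through `evaluate`, swapping in a different scoring function changes which prompts survive. This mirrors, in a simplified way, the abstract's point that the score function can be modified to reflect individual preferences.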