TRAD: Enhancing LLM Agents with Step-Wise Thought Retrieval and Aligned Decision
arXiv (2024)
Abstract
Numerous large language model (LLM) agents have been built for different
tasks like web navigation and online shopping due to LLMs' wide knowledge and
text-understanding ability. Many of these works utilize in-context
examples to achieve generalization without the need for fine-tuning, yet few
have considered how to select and effectively utilize
these examples. Recently, methods based on trajectory-level retrieval with task
meta-data, using whole trajectories as in-context examples, have been proposed to
improve the agent's overall performance in some sequential decision-making
tasks. However, these methods can be problematic for two reasons: plausible-looking examples
may be retrieved without regard to task-specific state-transition dynamics, and
long inputs carry plenty of irrelevant context. In this paper, we propose a novel framework
(TRAD) to address these issues. TRAD first conducts Thought Retrieval,
achieving step-level demonstration selection via thought matching, leading to
more helpful demonstrations and less irrelevant input noise. Then, TRAD
introduces Aligned Decision, complementing retrieved demonstration steps with
their previous or subsequent steps, which enables tolerance for imperfect
thoughts and provides a tunable balance between more context and less noise.
Extensive experiments on ALFWorld and Mind2Web benchmarks show that TRAD not
only outperforms state-of-the-art models but also effectively helps in reducing
noise and promoting generalization. Furthermore, TRAD has been deployed in
real-world scenarios of a global business insurance company and improves the
success rate of robotic process automation.
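The two stages described above can be illustrated with a minimal sketch: retrieve the single demonstration step whose thought best matches the current thought (Thought Retrieval), then expand it with its neighboring steps (Aligned Decision). All names here (`retrieve_aligned_steps`, the `before`/`after` window, the vector-based thought representation) are illustrative assumptions, not the paper's actual implementation.

```python
import math

def cosine(a, b):
    """Cosine similarity between two dense vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def retrieve_aligned_steps(query_vec, demo_trajs, before=1, after=1):
    """Step-level retrieval sketch: scan every step of every demonstration
    trajectory, pick the step whose thought embedding is most similar to the
    current thought, then return that step's action together with its
    `before` preceding and `after` following actions (context expansion)."""
    best_traj, best_idx, best_sim = None, None, -1.0
    for traj in demo_trajs:
        for i, step in enumerate(traj):
            sim = cosine(query_vec, step["thought_vec"])
            if sim > best_sim:
                best_traj, best_idx, best_sim = traj, i, sim
    lo = max(0, best_idx - before)
    hi = min(len(best_traj), best_idx + after + 1)
    return [s["action"] for s in best_traj[lo:hi]]

# Hypothetical demonstration trajectory with 2-d thought embeddings.
demos = [[
    {"thought_vec": [1.0, 0.0], "action": "a0"},
    {"thought_vec": [0.0, 1.0], "action": "a1"},
    {"thought_vec": [1.0, 1.0], "action": "a2"},
]]
print(retrieve_aligned_steps([0.0, 1.0], demos))  # best match is step 1; window yields ['a0', 'a1', 'a2']
```

In practice the thought embeddings would come from an encoder over the agent's natural-language thoughts, and the `before`/`after` window sizes correspond to the paper's trade-off between richer context and added noise.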