OPEx: A Component-Wise Analysis of LLM-Centric Agents in Embodied Instruction Following
CoRR (2024)
Abstract
Embodied Instruction Following (EIF) is a crucial task in embodied learning,
requiring agents to interact with their environment through egocentric
observations to fulfill natural language instructions. Recent advancements have
seen a surge in employing large language models (LLMs) within a
framework-centric approach to enhance performance in embodied learning tasks,
including EIF. Despite these efforts, there is still no unified
understanding of how individual components, ranging from visual
perception to action execution, affect task performance. To address this gap, we
introduce OPEx, a comprehensive framework that delineates the core components
essential for solving embodied learning tasks: Observer, Planner, and Executor.
Through extensive evaluations, we provide a deep analysis of how each component
influences EIF task performance. We further deploy a multi-agent
dialogue strategy on a TextWorld counterpart, which additionally
improves task performance. Our findings reveal that LLM-centric design
markedly improves EIF outcomes, identify visual perception and low-level action
execution as critical bottlenecks, and demonstrate that augmenting LLMs with a
multi-agent framework further elevates performance.