Pragmatic Instruction Following and Goal Assistance via Cooperative Language-Guided Inverse Planning
CoRR (2024)
Abstract
People often give instructions whose meaning is ambiguous without further
context, expecting that their actions or goals will disambiguate their
intentions. How can we build assistive agents that follow such instructions in
a flexible, context-sensitive manner? This paper introduces cooperative
language-guided inverse plan search (CLIPS), a Bayesian agent architecture for
pragmatic instruction following and goal assistance. Our agent assists a human
by modeling them as a cooperative planner who communicates joint plans to the
assistant, then performs multimodal Bayesian inference over the human's goal
from actions and language, using large language models (LLMs) to evaluate the
likelihood of an instruction given a hypothesized plan. Given this posterior,
our assistant acts to minimize expected goal achievement cost, enabling it to
pragmatically follow ambiguous instructions and provide effective assistance
even when uncertain about the goal. We evaluate these capabilities in two
cooperative planning domains (Doors, Keys & Gems and VirtualHome), finding that
CLIPS significantly outperforms GPT-4V, LLM-based literal instruction following
and unimodal inverse planning in both accuracy and helpfulness, while closely
matching the inferences and assistive judgments provided by human raters.
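
The inference and decision steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the goal names, likelihood values, and cost table below are hypothetical, and the utterance likelihood that CLIPS obtains from an LLM is replaced by hand-picked numbers. The sketch only shows how a goal posterior combines action and language evidence, and how an assistant could pick the action with the lowest expected cost under that posterior.

```python
# Illustrative sketch only: all names and numbers are assumptions,
# not taken from the paper.

def goal_posterior(prior, action_lik, utterance_lik):
    """P(goal | actions, utterance), proportional to
    P(goal) * P(actions | goal) * P(utterance | goal)."""
    unnorm = {g: prior[g] * action_lik[g] * utterance_lik[g] for g in prior}
    total = sum(unnorm.values())
    if total == 0:
        raise ValueError("all goal hypotheses have zero probability")
    return {g: w / total for g, w in unnorm.items()}


def best_action(posterior, cost):
    """Return the assistant action with the lowest expected cost,
    where cost[action][goal] is the cost of reaching `goal` after `action`."""
    def expected_cost(action):
        return sum(posterior[g] * cost[action][g] for g in posterior)
    return min(cost, key=expected_cost)


# Hypothetical scenario: the human says "get the key" (ambiguous between keys)
# while walking toward the red gem.
prior = {"red_gem": 0.5, "blue_gem": 0.5}
action_lik = {"red_gem": 0.8, "blue_gem": 0.2}      # observed movement favors red
utterance_lik = {"red_gem": 0.5, "blue_gem": 0.5}   # stand-in for an LLM score

posterior = goal_posterior(prior, action_lik, utterance_lik)

cost = {
    "fetch_red_key": {"red_gem": 2, "blue_gem": 6},
    "fetch_blue_key": {"red_gem": 6, "blue_gem": 2},
}
choice = best_action(posterior, cost)
print(posterior, choice)
```

In this toy case the instruction alone does not distinguish the goals, so the observed actions decide the posterior, and the assistant fetches the key that is cheapest in expectation.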