RePLan: Robotic Replanning with Perception and Language Models
CoRR (2024)
Abstract
Advancements in large language models (LLMs) have demonstrated their
potential in facilitating high-level reasoning, logical reasoning and robotics
planning. Recently, LLMs have also been able to generate reward functions for
low-level robot actions, effectively bridging the interface between high-level
planning and low-level robot control. However, the challenge remains that even
with syntactically correct plans, robots can still fail to achieve their
intended goals. This failure can be attributed to imperfect plans proposed by
LLMs or to unforeseeable environmental circumstances that hinder the execution
of planned subtasks due to erroneous assumptions about the state of objects.
One way to avoid these failures is to rely on human-provided step-by-step
instructions, but this limits the autonomy of robotic systems. Vision Language Models
(VLMs) have shown remarkable success in tasks such as visual question answering
and image captioning. Leveraging the capabilities of VLMs, we present a novel
framework called Robotic Replanning with Perception and Language Models
(RePLan) that enables real-time replanning capabilities for long-horizon tasks.
This framework utilizes the physical grounding provided by a VLM's
understanding of the world's state to adapt robot actions when the initial plan
fails to achieve the desired goal. We test our approach within four
environments containing seven long-horizon tasks. We find that RePLan enables
a robot to successfully adapt to unforeseen obstacles while accomplishing
open-ended, long-horizon goals, where baseline models cannot. Find more
information at https://replan-lm.github.io/replan.github.io/
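The abstract describes a closed loop in which an LLM proposes a plan, the robot executes subtasks, and a VLM's view of the scene triggers replanning when execution fails. A minimal sketch of that loop, with entirely hypothetical stand-in functions (`llm_plan`, `execute_subtask`, `vlm_verify` are not the paper's actual API):

```python
# Hedged sketch of a perceive-plan-replan loop in the spirit of RePLan.
# All function names and signatures below are illustrative assumptions.

def llm_plan(goal, observation):
    """Hypothetical: ask an LLM for an ordered list of subtasks."""
    return [f"subtask toward: {goal}"]

def execute_subtask(subtask):
    """Hypothetical: hand the subtask to low-level control (e.g. an
    LLM-generated reward function driving the controller)."""
    pass

def vlm_verify(subtask, observation):
    """Hypothetical: ask a VLM whether the scene shows the subtask done."""
    return True

def run(goal, observation, max_replans=3):
    """Execute subtasks; on a VLM-detected failure, replan from the
    current world state instead of asking a human for new instructions."""
    plan = llm_plan(goal, observation)
    replans = 0
    while plan:
        subtask = plan.pop(0)
        execute_subtask(subtask)
        if not vlm_verify(subtask, observation):
            if replans >= max_replans:
                return False  # give up after repeated failures
            replans += 1
            plan = llm_plan(goal, observation)  # replan with fresh perception
    return True
```

The key design point the abstract emphasizes is the verification step: replanning is gated on the VLM's physically grounded judgment of the scene, not on the controller's own success signal.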