Do As I Can, Not As I Say: Grounding Language in Robotic Affordances

Michael Ahn, Anthony Brohan, Noah Brown, Yevgen Chebotar, Omar Cortes, Byron David, Chelsea Finn, Chuyuan Fu, Keerthana Gopalakrishnan, Karol Hausman, Alex Herzog, Daniel Ho, Jasmine Hsu, Julian Ibarz, Brian Ichter, Alex Irpan, Eric Jang, Rosario Jauregui Ruano, Kyle Jeffrey, Sally Jesmonth, Nikhil J Joshi, Ryan Julian, Dmitry Kalashnikov, Yuheng Kuang, Kuang-Huei Lee, Sergey Levine, Yao Lu, Linda Luu, Carolina Parada, Peter Pastor, Jornell Quiambao, Kanishka Rao, Jarek Rettinghouse, Diego Reyes, Pierre Sermanet, Nicolas Sievers, Clayton Tan, Alexander Toshev, Vincent Vanhoucke, Fei Xia, Ted Xiao, Peng Xu, Sichun Xu, Mengyuan Yan, Andy Zeng

arXiv (2022)

Citations: 215 | Views: 249
Abstract
Large language models can encode a wealth of semantic knowledge about the world. Such knowledge could be extremely useful to robots aiming to act upon high-level, temporally extended instructions expressed in natural language. However, a significant weakness of language models is that they lack real-world experience, which makes it difficult to leverage them for decision making within a given embodiment. For example, asking a language model to describe how to clean a spill might result in a reasonable narrative, but it may not be applicable to a particular agent, such as a robot, that needs to perform this task in a particular environment. We propose to provide real-world grounding by means of pretrained skills, which are used to constrain the model to propose natural language actions that are both feasible and contextually appropriate. The robot can act as the language model's "hands and eyes," while the language model supplies high-level semantic knowledge about the task. We show how low-level skills can be combined with large language models so that the language model provides high-level knowledge about the procedures for performing complex and temporally-extended instructions, while value functions associated with these skills provide the grounding necessary to connect this knowledge to a particular physical environment. We evaluate our method on a number of real-world robotic tasks, where we show the need for real-world grounding and that this approach is capable of completing long-horizon, abstract, natural language instructions on a mobile manipulator. The project's website and the video can be found at https://say-can.github.io/.
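To make the idea of combining a language model with skill value functions concrete, here is a minimal sketch of a single decision step. It is an illustration under stated assumptions, not the authors' implementation: the function name `select_skill`, the toy score tables, and the choice to multiply the two scores are all hypothetical. The abstract says only that value functions ground the language model's proposals; the product rule below is one simple way that could be done.

```python
from typing import Dict


def select_skill(llm_scores: Dict[str, float], values: Dict[str, float]) -> str:
    """Pick the skill that is both useful for the instruction and feasible now.

    llm_scores: how relevant the language model rates each skill for the
        instruction (hypothetical numbers in this sketch).
    values: how likely each skill is to succeed from the robot's current
        state, as a stand-in for a learned value function (also hypothetical).
    """
    # Assumption: combine the two signals by multiplication, so a skill must
    # score reasonably on both relevance and feasibility to be chosen.
    combined = {skill: llm_scores[skill] * values[skill] for skill in llm_scores}
    return max(combined, key=combined.get)


# Toy scenario for "clean up the spill": the language model prefers picking up
# a sponge, but no sponge is within reach, so that skill has a low value.
LLM_SCORES = {
    "pick up the sponge": 0.6,
    "find a sponge": 0.3,
    "done": 0.1,
}
VALUES = {
    "pick up the sponge": 0.1,
    "find a sponge": 0.9,
    "done": 1.0,
}

best = select_skill(LLM_SCORES, VALUES)
print(best)  # find a sponge
```

In this toy case the language model alone would choose "pick up the sponge", while the combined score prefers "find a sponge" because it is the relevant skill that is currently feasible. Repeating the step after each executed skill, with the chosen skill appended to the model's prompt, would yield a multi-step plan.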