Visual Semantic Planning Using Deep Successor Representations

2017 IEEE International Conference on Computer Vision (ICCV), 2017

Abstract
A crucial capability of real-world intelligent agents is their ability to plan a sequence of actions to achieve their goals in the visual world. In this work, we address the problem of visual semantic planning: the task of predicting a sequence of actions from visual observations that transform a dynamic environment from an initial state to a goal state. Doing so entails knowledge about objects and their affordances, as well as actions and their preconditions and effects. We propose learning these through interacting with a visual and dynamic environment. Our proposed solution involves bootstrapping reinforcement learning with imitation learning. To ensure cross-task generalization, we develop a deep predictive model based on successor representations. Our experiments show near-optimal performance across a wide range of tasks in the challenging THOR environment.
Keywords
visual semantic planning, deep successor representations, real-world intelligent agents, visual environment, deep predictive model, bootstrapping reinforcement learning, imitation learning, cross-task generalization, computer vision