Task planning for visual room rearrangement under partial observability
ICLR 2024
Abstract
This paper presents a novel hierarchical task planner under partial observability
that empowers an embodied agent to use visual input to efficiently plan a sequence
of actions for simultaneous object search and rearrangement in an untidy room,
to achieve a desired tidy state. The paper introduces (i) a novel Search Network
that utilizes commonsense knowledge from large language models to find unseen
objects, (ii) a Deep RL network trained with a proxy reward, along with (iii) a novel
graph-based state representation to produce a scalable and effective planner that
interleaves object search and rearrangement to minimize the number of steps taken
and overall traversal of the agent, as well as to resolve blocked goal and swap
cases, and (iv) a sample-efficient cluster-biased sampling for simultaneous training
of the proxy reward network along with the Deep RL network. Furthermore,
the paper presents new metrics and a benchmark dataset, RoPOR, to measure
the effectiveness of rearrangement planning. Experimental results show that our
method significantly outperforms the state-of-the-art rearrangement methods of Weihs
et al. (2021a), Gadre et al. (2022), Sarch et al. (2022), and Ghosh et al. (2022).
Keywords
Task Planning, Object Search, Deep-RL