Room-and-Object Aware Knowledge Reasoning for Remote Embodied Referring Expression

2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021(2021)

引用 54|浏览87
暂无评分
摘要
The Remote Embodied Referring Expression (REVERIE) is a recently raised task that requires an agent to navigate to and localise a referred remote object according to a high-level language instruction. Different from related VLN tasks, the key to REVERIE is to conduct goal-oriented exploration instead of strict instruction-following, due to the lack of step-by-step navigation guidance. In this paper, we propose a novel Cross-modality Knowledge Reasoning (CKR) model to address the unique challenges of this task. The CKR, based on a transformer-architecture, learns to generate scene memory tokens and utilise these informative history clues for exploration. Particularly, a Room-and-Object Aware Attention (ROAA) mechanism is devised to explicitly perceive the room- and object-type information from both linguistic and visual observations. Moreover, through incorporating commonsense knowledge, we propose a Knowledge-enabled Entity Relationship Reasoning (KERR) module to learn the internal-external correlations among room- and object-entities for agent to make proper action at each viewpoint. Evaluation on REVERIE benchmark demonstrates the superiority of the CKR model, which significantly boosts SPL and REVERIE-success rate by 64.67% and 46.05%, respectively. Code is available at: https://github.com/alloldman/CKR.
更多
查看译文
关键词
novel Cross-modality Knowledge Reasoning model,informative history clues,object-type information,incorporating commonsense knowledge,Knowledge-enabled Entity Relationship Reasoning module,object-entities,REVERIE benchmark,CKR model,REVERIE-success,Remote Embodied Referring Expression,recently raised task,referred remote object,high-level language instruction,related VLN tasks,goal-oriented exploration,strict instruction-following,step-by-step navigation guidance
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要