Scene Change Captioning in Real Scenarios.

International Conference on Human-Computer Interaction (HCI International) (2022)

Abstract
This paper addresses scene change captioning, the task of describing scene changes in natural language, in real scenarios. Most current three-dimensional understanding tasks focus on recognizing static scenes. Despite its importance in a variety of real-environment applications, scene change understanding remains less discussed. Existing change understanding methods in robotics focus on change detection and cannot recognize scene changes in detail. Most previous experiments on change captioning methods were conducted on simulation datasets with limited visual complexity, which limits their applicability to real scenarios. To address these issues, we propose a scene change captioning dataset with scenes photographed using RGB-D cameras. We also propose an automatic simulation dataset generation process aimed at training models that transfer to real scenarios. We conducted experiments with various input modalities and propose a method that integrates them using an attention mechanism over modalities and dynamic attention to select relevant information during sentence generation. The experimental results show that models trained on the proposed simulation dataset obtained promising results on the real-scenario dataset, indicating the practicality of the proposed dataset generation process in real scenarios. The proposed multimodal integration method can generate change captions with high change-type and object-attribute accuracy while showing robustness in real scenarios. We hope our work opens the door to future research on scene change understanding in real scenarios.
Keywords
Computer vision, Deep learning, Natural language processing