CI-Net: a joint depth estimation and semantic segmentation network using contextual information

Applied Intelligence (2022)

Abstract
Monocular depth estimation and semantic segmentation are two fundamental goals of scene understanding. Because the two tasks benefit from interacting with each other, many works have studied joint-task learning algorithms. However, most existing methods fail to fully leverage the semantic labels: they ignore the context structures the labels provide and use them only to supervise the prediction of the segmentation branch, which limits the performance of both tasks. In this paper, we propose a network injected with contextual information (CI-Net) to solve this problem. Specifically, we introduce a self-attention block in the encoder to generate an attention map. With supervision from the ideal attention map created from the semantic labels, the network is embedded with contextual information so that it can understand the scene better and use correlated features to make accurate predictions. In addition, a feature-sharing module (FSM) is constructed to deeply fuse the task-specific features, and a consistency loss is devised to ensure that the features of the two tasks guide each other. We extensively evaluate the proposed CI-Net on the NYU-Depth-v2, SUN-RGBD, and Cityscapes datasets. The experimental results validate that CI-Net effectively improves the accuracy of both semantic segmentation and depth estimation.
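The abstract does not spell out how the ideal attention map is built from the semantic labels or which loss supervises the predicted map. As a rough illustration only, the sketch below assumes one plausible reading: two pixels should attend to each other exactly when they share a class label, and the predicted attention map is compared with that target using binary cross-entropy. The function names `ideal_attention_map` and `attention_bce`, the `ignore_index` value, and the choice of loss are all assumptions made for this example, not details taken from the paper.

```python
import numpy as np


def ideal_attention_map(labels: np.ndarray, ignore_index: int = 255) -> np.ndarray:
    """Build an (H*W, H*W) binary affinity map from an (H, W) label map.

    Assumption: entry (i, j) is 1 when pixels i and j carry the same class
    label, and 0 otherwise. Pixels marked with ignore_index match nothing.
    """
    flat = labels.reshape(-1)
    same_class = flat[:, None] == flat[None, :]  # pairwise label comparison
    valid = flat != ignore_index
    return (same_class & valid[:, None] & valid[None, :]).astype(np.float32)


def attention_bce(pred: np.ndarray, ideal: np.ndarray, eps: float = 1e-7) -> float:
    """Mean binary cross-entropy between a predicted attention map with
    values in [0, 1] and the ideal map (hypothetical choice of loss)."""
    p = np.clip(pred.astype(np.float64), eps, 1.0 - eps)  # avoid log(0)
    t = ideal.astype(np.float64)
    return float(-np.mean(t * np.log(p) + (1.0 - t) * np.log(1.0 - p)))


# Toy 2x2 label map: the two top pixels share class 0.
labels = np.array([[0, 0],
                   [1, 2]])
ideal = ideal_attention_map(labels)
uniform_loss = attention_bce(np.full(ideal.shape, 0.5), ideal)
perfect_loss = attention_bce(ideal, ideal)
```

In a real network the label map would typically be downsampled to the encoder's feature resolution before building the target, since the pairwise map grows quadratically with the number of pixels.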
Keywords
Depth estimation, Semantic segmentation, Attention mechanism, Task interaction