HFMDNet: Hierarchical Fusion and Multi-Level Decoder Network for RGB-D Salient Object Detection

Yi Luo, Feng Shao, Zhengxuan Xie, Huizhi Wang, Hangwei Chen, Baoyang Mu, Qiuping Jiang

IEEE Transactions on Instrumentation and Measurement (2024)

Abstract
Vision-based measurement techniques are required in the quality inspection of many kinds of products. However, most existing methods rely on a single modality (an RGB image or a depth map) for defect detection. In this paper, we propose a technique with potential for defect detection: we introduce RGB-D salient object detection (SOD) as a measurement method and present a Hierarchical Fusion and Multi-Level Decoder Network (HFMDNet). The key to multi-modal SOD, which has recently become popular, lies in effectively acquiring cross-modal complementary information and enabling interaction between cross-level information. Most existing methods employ various strategies for cross-modal fusion or apply feature enhancement before fusion. However, these methods ignore the hierarchical distinctions between RGB images and depth maps during cross-modal fusion, which leads to suboptimal performance in some challenging cases. HFMDNet takes cross-level information interaction into account in both the fusion and decoding stages. Specifically, we design a hierarchical fusion module (HFM), consisting of a low-level feature fusion (LFF) module and a high-level feature fusion (HFF) module, to compensate for the modal differences in multi-modal data. A multi-level refinement decoder (MRD) then enhances, refines, and decodes the fused features to generate high-quality saliency maps. In addition, we introduce edge features as auxiliary information in the decoding phase so that the detected salient objects have clear boundaries. Extensive experiments on nine publicly available datasets demonstrate that HFMDNet delivers competitive performance.
Keywords
Vision-based measurement, RGB-D salient object detection, Transformer, Multi-modal fusion, Multi-level information interaction
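
The abstract gives only the overall data flow (level-specific fusion of RGB and depth features, followed by a multi-level decoder), not the internals of the LFF, HFF, or MRD modules. The following is a minimal sketch of that data flow, not the authors' implementation. Every function name and every fusion rule in it is a hypothetical stand-in chosen for illustration: depth acts as a spatial gate at the low level, the two modalities are blended with global weights at the high level, and a top-down decoder upsamples and adds. Edge features and the Transformer backbone are omitted.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def _combine(a, b, fn):
    """Apply fn element-wise to two equally sized 2-D maps (lists of rows)."""
    return [[fn(x, y) for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def _mean(m):
    return sum(sum(row) for row in m) / (len(m) * len(m[0]))


def fuse_low_level(rgb, depth):
    """Hypothetical stand-in for LFF: depth spatially gates the RGB features."""
    return _combine(rgb, depth, lambda r, d: r * (1.0 + sigmoid(d)))


def fuse_high_level(rgb, depth):
    """Hypothetical stand-in for HFF: blend modalities with global weights."""
    w_rgb, w_depth = sigmoid(_mean(rgb)), sigmoid(_mean(depth))
    return _combine(rgb, depth, lambda r, d: w_rgb * r + w_depth * d)


def upsample2x(m):
    """Nearest-neighbour upsampling by a factor of two in both dimensions."""
    out = []
    for row in m:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def decode(fused_levels):
    """Hypothetical stand-in for MRD: top-down pass from coarse to fine.

    fused_levels is ordered fine to coarse; each level is assumed to have
    twice the resolution of the next one.
    """
    out = fused_levels[-1]
    for finer in reversed(fused_levels[:-1]):
        out = _combine(upsample2x(out), finer, lambda a, b: a + b)
    return [[sigmoid(v) for v in row] for row in out]


def predict_saliency(rgb_levels, depth_levels):
    """Fuse a two-level feature pyramid per level, then decode to a map in (0, 1)."""
    fused = [
        fuse_low_level(rgb_levels[0], depth_levels[0]),
        fuse_high_level(rgb_levels[1], depth_levels[1]),
    ]
    return decode(fused)


if __name__ == "__main__":
    low = [[0.1 * (i + j) for j in range(4)] for i in range(4)]   # 4x4 fine level
    high = [[0.5, -0.5], [1.0, 0.0]]                               # 2x2 coarse level
    saliency = predict_saliency([low, high], [low, high])
    print(len(saliency), len(saliency[0]))
```

The point of the sketch is the structure: the fusion rule differs by feature level, and the decoder combines the fused levels from coarse to fine. In a real network each stand-in would be a learned module operating on multi-channel tensors.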