DSFNet: dynamic selection-fusion networks for video salient object detection

Jun Wang, Zhu Huang, Ziqing Huang, Miaohui Zhang, Xing Ren

Multimedia Tools and Applications (2023)

Abstract
Effectively fusing spatiotemporal cues is key to improving the accuracy of video salient object detection. Although most existing methods have achieved strong results with their fusion strategies, the reliability of the spatiotemporal cues themselves requires further investigation, since unreliable cues can corrupt the final saliency results. In this work, we propose a dynamic selection-fusion network (DSFNet) for video salient object detection, built from two branches. The first is a spatial learning network, which learns from video sequences to obtain the spatial saliency of individual frames. The second is a spatiotemporal contrast network, which obtains dynamic spatiotemporal saliency in a synchronized manner by learning from the video sequence together with the corresponding optical-flow images. To further screen and fuse the spatiotemporal cues, we develop a set of joint selection modules, mainly a contrast transformation module (CTM), a contrast analysis module (CAM), and a selection guidance module (SGM), which together play a key role in selecting reliable spatiotemporal features. In addition, a fusion refinement module (FRM) is designed to further refine and enhance the fused features. Experimental results show that the proposed method significantly outperforms other algorithms in handling motion-information distortion and spatiotemporally irrelevant saliency.
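The abstract only outlines the two-branch selection-fusion idea and names the modules without giving their internals, so the following is a minimal, hypothetical PyTorch sketch rather than the authors' implementation. The module names (SpatialBranch-style ConvBlock, SelectionGate, the refine head) and all layer choices are assumptions; the sketch only illustrates the general pattern of gating between appearance features and optical-flow features before a refinement step produces a saliency map.

```python
# Hypothetical sketch of a two-branch dynamic selection-fusion layout.
# NOT the paper's CTM/CAM/SGM/FRM: those internals are not given in the
# abstract, so every module here is an illustrative assumption.
import torch
import torch.nn as nn


class ConvBlock(nn.Module):
    """3x3 conv + BN + ReLU, shared building block for both branches."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, 3, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        return self.body(x)


class SelectionGate(nn.Module):
    """Predicts a per-pixel weight deciding how much to trust the
    flow-based (temporal) features versus the appearance features."""
    def __init__(self, ch):
        super().__init__()
        self.gate = nn.Sequential(
            nn.Conv2d(2 * ch, ch, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(ch, 1, 1),
            nn.Sigmoid(),
        )

    def forward(self, f_spatial, f_temporal):
        w = self.gate(torch.cat([f_spatial, f_temporal], dim=1))
        # Unreliable motion cues get down-weighted in favor of appearance.
        return w * f_temporal + (1.0 - w) * f_spatial


class SelectionFusionSketch(nn.Module):
    def __init__(self, ch=32):
        super().__init__()
        self.spatial_branch = ConvBlock(3, ch)   # RGB frame
        self.flow_branch = ConvBlock(3, ch)      # optical-flow image (3-channel visualization assumed)
        self.select = SelectionGate(ch)
        self.refine = nn.Sequential(              # stand-in for a fusion refinement stage
            ConvBlock(ch, ch),
            nn.Conv2d(ch, 1, 1),
        )

    def forward(self, frame, flow):
        f_s = self.spatial_branch(frame)
        f_t = self.flow_branch(flow)
        fused = self.select(f_s, f_t)
        return torch.sigmoid(self.refine(fused))  # per-pixel saliency map


if __name__ == "__main__":
    net = SelectionFusionSketch()
    frame = torch.randn(1, 3, 224, 224)
    flow = torch.randn(1, 3, 224, 224)
    print(net(frame, flow).shape)  # torch.Size([1, 1, 224, 224])
```

The design choice illustrated here is that fusion is not a fixed sum or concatenation: a learned gate decides, per pixel, how much the motion stream contributes, which is one plausible reading of "dynamic selection" before refinement.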
Keywords
Video salient object detection, Dynamic selection-fusion network, Spatiotemporal features